


Vol. 40 MARCH, 1949 No. 3 


The Journal of Educational 
Psychology 


Devoted Primarily to the Scientific Study of Problems of Learning and Teaching 





CONTENTS 


Applications of the Simple Method of Factor Analysis . ... . . 129 
KARL J. HOLZINGER 


A Class of Methods for Estimating Reaction to Stimuli of Varying Severity . 143
PHILIP J. McCARTHY


[Title illegible]
FRANCES SWINEFORD


The Validity of University Counselor Self-ratings ........ 168 
A. J. DRUCKER AND H. H. REMMERS 


A Scientific Appraisal of Professional Education for Business. . . 174 
DONALD K. BECKLEY 


FP Pe er eee etree ee eS eT 189 


$6.00 per Year - Published Monthly October to May 


WARWICK & YORK, INC. 


BALTIMORE 2, MD. 
Entered as Second Class Matter Nov. 15, 1921, at the Post Office at Baltimore, Md. 
under the Act of March 3, 1879. Additional entry as Second Class Matter at York, Pa.








THE JOURNAL OF 
Educational Psychology 


Established 1910 


EDITORIAL BOARD 


STEPHEN M. COREY                          ROBERT T. ROCK, JR.
Teachers College, Columbia University     Fordham University
—Learning—                                —Individual Differences—
JACK W. DUNLAP                            PERCIVAL M. SYMONDS
152 W. 42nd St., New York 17, N.Y.        Teachers College, Columbia University
—Technical, Statistics—                   —Mental Hygiene—
KARL J. HOLZINGER                         MILES A. TINKER
University of Chicago                     University of Minnesota
—Factor Analysis—                         —Reading—
HAROLD E. JONES                           ALEXANDER G. WESMAN
University of California                  The Psychological Corporation
—Social Behavior, Child Psychology—       —Tests—
H. H. REMMERS                             PAUL A. WITTY
Purdue University                         Northwestern University
—Attitudes, Teacher Evaluation—           —Children’s Interests—
H. E. BUCHHOLZ
Managing Editor 
10 E. Centre St., Baltimore 2, Md. 


THE JOURNAL OF EDUCATIONAL PSYCHOLOGY is devoted primarily
to the scientific study of problems of learning, teaching,
and measurement of the psychological development of the
individual. The JOURNAL will contain articles on the following
subjects: the psychology of school subjects; experimental studies of
learning; the development of interests, attitudes, and personality, 
particularly as related to school adjustment; emotion, motivation, 
and character; mental development and methods. This last will 
include tests, statistical techniques, and research techniques in 
cross-sectional and developmental studies. 


Manuscripts may be submitted to any member of the Editorial 
Board, but the handling of an article will be facilitated if it is sent 
to that member of the Board who is designated as particularly 
interested in the phase of psychology dealt with. (Such designations 
appear after the Editors’ names in the list given above.) 

Books and other materials for review and correspondence 
regarding editorial matters should be addressed to The Journal 
of Educational Psychology, Warwick and York, Inc., Publishers,
10 E. Centre St., Baltimore 2, Md. 


Manuscripts should be typed and double-spaced throughout, 
including quotations, footnotes, and references. In order to attain 


(Continued on Inside Back Cover) 
















APPLICATIONS OF THE SIMPLE METHOD OF 
FACTOR ANALYSIS 


KARL J. HOLZINGER
The University of Chicago 


This is a simple exposition of certain concepts in oblique factor 
analysis, a comparison of two ideal types of test configuration, 
and the application of the Simple Method of Factor Analysis to 
problems where the clustering of test vectors warrants such a 
technique. 

For the students of factor analysis who have read Professor 
Thurstone’s recent book on Multiple-Factor Analysis, my Student 
Manual, and Factor Analysis by Holzinger and Harman,¹ the
present discussion should be unnecessary. If any small number 
of such persons exist they may be spared reading further. In 
recent years, however, a number of my students and two foreign 
mathematicians have failed to note some of the simple relation- 
ships I shall point out. Perhaps it is the simple things that are 
overlooked more than the complicated mathematical ideas, or 
perhaps everything is really simple when viewed in proper per- 
spective. In any event this paper is written in full realization 
of the fact that it adds nothing essentially new to the factor 
problem. 

Some knowledge of statistics and factor analysis is assumed on 
the part of the reader, but this article is designed primarily for 
the psychologist who interprets factor analyses rather than for 
those who apply the methods. Therefore, I have tried to explain 





1 L. L. Thurstone, Multiple-Factor Analysis. Chicago: University of
Chicago Press, 1947. 

Karl J. Holzinger, assisted by Frances Swineford and Harry Harman, 
Student Manual of Factor Analysis. Chicago: Statistical Laboratory, 
Department of Education, University of Chicago, 1937. 

Karl J. Holzinger and Harry H. Harman, Factor Analysis. Chicago: 
University of Chicago Press, 1941. 




and interpret such concepts as ‘structure’ and ‘pattern’ in as 
simple a way as possible, rather than to review merely formal 
definitions or assume that the reader is familiar with the more 
formal discussion of these topics in my books. 


1.—REVIEW OF STRUCTURE AND PATTERN CONCEPTS 


The two matrices necessary in all oblique factor analysis I have 
designated as ‘structure’ and ‘pattern.’ In order that these con- 
cepts may be clear to the reader, these terms will be redefined, 
simple analytic examples will be given, and, finally, a geometric 
interpretation will be made. 

A factor ‘structure’ is a set (matrix) of correlations between 
tests and factors. A ‘pattern’ is a matrix of coefficients in the
linear expressions defining tests in terms of factors. For ortho- 
gonal factors these two matrices are identical, but for oblique 
factors they may be very different in value, and they have 
different uses in analysis. It is for these reasons that the two 
concepts were originally defined. 

With no indication here as to how they were obtained, the 
structure and pattern matrices for three tests will next be set 
down for interpretation. 











                Structure               Pattern
              (Correlations)         (Coördinates)
Test      $r_{iF_1}$   $r_{iF_2}$     $F_1$    $F_2$
 1           .90          .60          .8       .2
 2           .85          .65          .7       .3
 3           .95          .55          .9       .1
















Considering only test 1, the structure values .90 and .60 are
interpreted as the correlations between the test and the two
oblique factors $F_1$ and $F_2$, respectively. The two pattern values,
.8 and .2, are coefficients in the equation,


$$z_1' = .8F_1 + .2F_2 + .4U_1, \tag{1}$$


where $U_1$ is a ‘unique’ factor assumed to be uncorrelated with










$F_1$ and $F_2$, and $z_1'$ is used to designate the test in the total-factor
space. The correlations such as .90 and .60 are useful if the
analyst wishes to estimate factors from tests, but they do not
give a clear indication of the ‘saturation’ of tests with factors.
The pattern matrix is useful for this purpose. From the coefficients
.8, .2, and .4, it can be determined that test 1 is most
highly saturated with factor $F_1$, next with $U_1$, and least with $F_2$.
The proportions of saturation, however, vary with the squares of 
these numbers and an additional term to be described next. 

In the illustrative problem above, two hypothetical factors,
$F_1$ and $F_2$, were assumed to have a correlation of exactly .5. It
was also assumed that all variables ($z_1'$, $F_1$, $F_2$, and $U_1$) have
unit variance. The contribution of the three factors to the unit
variance of $z_1'$ may then be obtained by squaring both sides of
equation (1), summing, and dividing by N, as follows:


$$\frac{\sum (z_1')^2}{N} = .64\,\frac{\sum F_1^2}{N} + .04\,\frac{\sum F_2^2}{N} + 2 \times .8 \times .2\,\frac{\sum F_1F_2}{N} + .16\,\frac{\sum U_1^2}{N},$$


all other terms being dropped because $U_1$ is assumed uncorrelated
with $F_1$ and $F_2$. This last equation reduces to


$$\sigma_{z_1'}^2 = 1 = .64 + .04 + .16 + .16 \tag{2}$$





because all variables are in standard form. It is then possible 
to say that 

(1) Sixty-four per cent of the variance of $z_1'$ is attributable
directly to factor $F_1$,

(2) Four per cent of the variance of $z_1'$ is attributable directly
to the factor $F_2$,

(3) Sixteen per cent of the variance of $z_1'$ is attributable to the
joint influence of factors $F_1$ and $F_2$, and

(4) Sixteen per cent of the variance of $z_1'$ is attributable to the
factor $U_1$, which is unique for variable $z_1'$.

Although the entries in the structure matrix give some indi- 
cation of the saturation of tests with factors, no such precise 
analysis as that just made is possible from that matrix (except, 
of course, for uncorrelated factors, where structure and pattern 
are the same). 
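
The four percentages can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the pattern coefficients .8, .2, and .4 and the factor correlation of .5 given above; the function name `variance_contributions` is only an illustrative label and does not come from the text.

```python
def variance_contributions(b1, b2, u, r12):
    """Split the unit variance of a standardized test into four parts.

    b1, b2 : pattern coefficients on the two oblique factors
    u      : coefficient on the unique factor
    r12    : correlation between the two oblique factors
    """
    return {
        "F1 direct": b1 ** 2,
        "F2 direct": b2 ** 2,
        "F1 and F2 joint": 2 * b1 * b2 * r12,
        "unique": u ** 2,
    }


# Test 1 of the illustrative problem.
parts = variance_contributions(0.8, 0.2, 0.4, 0.5)
total = sum(parts.values())

for name, value in parts.items():
    print(f"{name:>16}: {value:.2f}")
print(f"{'total':>16}: {total:.2f}")
```

The joint term is the only one that depends on the factor correlation; with uncorrelated factors it vanishes and the squared pattern values alone add up to the variance.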

Figure 1 has been made to illustrate the above concepts for 
the case of test 1. Inasmuch as the correlation between two 
tests or factors (in standard or deviate form) is given by the 










cosine of the angle between the two vectors, the correlation
between $F_1$ and $F_2$ is .5 for the 60° angle chosen for illustration.
The vectors $OF_1$ and $OF_2$ have unit length. The unique factor
vector $OU_1$ is to be imagined drawn perpendicular to the plane
of the paper at the origin. Test vector $Oz_1'$ also has the same
length as $OF_1$ and $OF_2$, but it is above the plane of the paper in
the space determined by the origin, $F_1$, $F_2$, and $U_1$. For present
purposes it is convenient to project the vector $Oz_1'$ on the plane
of the paper and designate the shortened vector as $Oz_1''$. This
projection is now thought of as a geometric picture of the ‘common
factor’ portion of the test $z_1'$.

The sum of the first three terms in equation (2) gives what is
known as the ‘communality’ ($h_1^2$) of test 1, here equal to .84.
The square root of .84 (= .92) is the $h_1$ which is indicated
geometrically by the length of the projected vector $Oz_1''$.

Dealing first with the geometric interpretation of the structure 
values .9 and .6 for test 1, it will be observed that these are 
shown by the line segments OP and OQ, respectively. The 
value .9, for example, is the cosine of the angle $POz_1'$. This
angle is determined by the origin and P in the plane of the paper,
and by the point $z_1'$, which is .4 units above the plane of the
paper at $z_1''$. A similar interpretation may be made for the
structure value .6. 

Turning to the geometric interpretation of the pattern values
.8 and .2 for test 1, it will be noted that these are given by the
line segments $Nz_1'' = OM$ and $Mz_1'' = ON$, respectively. These
are known as the ‘oblique coördinates’ of the point $z_1''$ in the
common-factor space. All such coördinates are distances measured
parallel to axes indicated above. The point $z_1'$ has the
additional coördinate of .4 along the $OU_1$ axis.
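
The geometric statements above can be verified numerically. The sketch below is only an illustration under an assumed placement of the axes: $F_1$ is laid along the x-axis of the paper and $F_2$ at 60° to it. The helper names `common_factor_point` and `projection_on_axis` are hypothetical.

```python
import math


def common_factor_point(b1, b2, angle_deg):
    """Cartesian position of the projected test point when F1 lies on the
    x-axis and F2 makes angle_deg with it (b1, b2 are oblique coordinates)."""
    a = math.radians(angle_deg)
    return (b1 + b2 * math.cos(a), b2 * math.sin(a))


def projection_on_axis(point, angle_deg):
    """Perpendicular projection of point on a unit axis at angle_deg."""
    a = math.radians(angle_deg)
    return point[0] * math.cos(a) + point[1] * math.sin(a)


z = common_factor_point(0.8, 0.2, 60.0)   # the point z1'' in the plane
op = projection_on_axis(z, 0.0)           # segment OP on the F1 axis
oq = projection_on_axis(z, 60.0)          # segment OQ on the F2 axis
h = math.hypot(z[0], z[1])                # length of the projected vector

print(f"OP = {op:.2f}, OQ = {oq:.2f}, h = {h:.2f}, h squared = {h * h:.2f}")
```

Because the unique axis is perpendicular to the plane, projecting the full test vector or its shadow in the plane onto a factor axis gives the same segment, which is why the structure values can be read from the plane diagram.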

If the reader has a little difficulty with these geometric ideas, 
he may be aided by the following simple device. Measure a 
piece of thread with a distance equal to $OF_1$ between two knots.
Place one knot at the origin and hold the other knot in space
so that it comes directly over the point $z_1''$ in the diagram. The
projection $Oz_1''$ can be seen by squinting directly down on the
thread above the knot. The projection OP may also be noted 
by holding the knot steady and moving the eye back a few inches. 

It is also suggested that the reader verify the structure values 
for tests 2 and 3, assuming the pattern values and the correlation 








[Figure page: diagrams not recoverable from the scan; one surviving label reads "Cluster."]


of .5 between factors $F_1$ and $F_2$.¹ Diagrams similar to Fig. 1
may also be prepared. 
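
The suggested verification amounts to multiplying each row of pattern values by the matrix of factor correlations. The sketch below assumes the correlation of .5 stated in the text. The pattern values .7 (test 2) and .1 (test 3) are partly illegible in the scanned table and are inferred here as the values that reproduce the printed structure, so they should be treated as an assumption. The function name is illustrative only.

```python
def structure_from_pattern(pattern, phi):
    """Return the structure rows obtained as (pattern row) x (factor correlations)."""
    m = len(phi)
    return [
        [sum(row[k] * phi[k][j] for k in range(m)) for j in range(m)]
        for row in pattern
    ]


phi = [[1.0, 0.5],
       [0.5, 1.0]]            # correlation between F1 and F2 is .5

pattern = [[0.8, 0.2],        # test 1 (given in the text)
           [0.7, 0.3],        # test 2 (.7 inferred, .3 printed)
           [0.9, 0.1]]        # test 3 (.9 printed, .1 inferred)

structure = structure_from_pattern(pattern, phi)

for test, row in enumerate(structure, start=1):
    print(test, [round(value, 2) for value in row])
```

Each row should agree with the printed structure values (.90 and .60, .85 and .65, .95 and .55).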


2.— CONFIGURATION TYPES 


The test configuration is the set of n test vectors at a common 
origin. If subgroups of tests form clusters as illustrated by the 
ideal spherical model in Fig. 2, the factors $F_1$, $F_2$, and $F_3$ would
form the basis for a parsimonious solution of nine tests in terms
of three oblique factors. The three tests near $F_1$ have such small
coördinates with respect to the $OF_2$ and $OF_3$ axes that the
common-factor portion for these three tests would have an
oblique factor pattern approximately as follows:


$$
\begin{aligned}
z_1'' &= .99F_1 + .0F_2 + .0F_3 \\
z_2'' &= .99F_1 + .0F_2 + .0F_3 \\
z_3'' &= .99F_1 + .01F_2 + .0F_3
\end{aligned}
$$


The three tests 1, 2, and 3 are thus nearly pure measures of the 
factor $F_1$, which I have called the ‘uni-factor’ ideal in factor
analysis. The remaining two groups will also be nearly pure
measures of the factors $F_2$ and $F_3$. The number of common
factors in a pattern gives the ‘complexity’ of a test. In the 
above model the complexity of all tests is approximately one. 
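
As a rough illustration of the complexity count, the sketch below tallies the pattern coefficients that are not negligible. The cut-off of .1 for "negligible" is an assumption made only for this example; the text does not fix one. The second row is a hypothetical test lying between two axes.

```python
def complexity(pattern_row, threshold=0.1):
    """Count common-factor coefficients whose absolute size exceeds threshold.

    The threshold is an assumed cut-off for illustration, not a value from the text.
    """
    return sum(1 for b in pattern_row if abs(b) > threshold)


uni_factor_row = [0.99, 0.0, 0.0]   # a test near F1, as in the cluster model
two_factor_row = [0.6, 0.5, 0.0]    # hypothetical test with two sizeable coordinates

print(complexity(uni_factor_row), complexity(two_factor_row))
```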

With actual data the clusters may not be so compact as in 
Fig. 2, but with a well designed test battery a good approxi- 
mation to the above ideal is possible. In fact, all the analyses 
made at the Statistical Laboratory during the last fifteen years 
conform to this model, with only minor variations. (All clear-
cut bi-factor patterns reduce to uni-factor oblique solutions.)

It may happen, of course, that some test batteries do not con- 
form to the type illustrated by Fig. 2 without a good deal of 





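1 The calculations for test 1 were done as follows:
Assume the pattern $z_1' = .8F_1 + .2F_2 + .4U_1$ and $r_{F_1F_2} = .5$.
Structure values were calculated by multiplying both sides of this
equation by $F_1$ and then by $F_2$, summing, dividing by N, and reducing:

$$\frac{\sum z_1'F_1}{N} = .8\,\frac{\sum F_1^2}{N} + .2\,\frac{\sum F_1F_2}{N} + \text{zero terms}$$

$$r_{z_1'F_1} = .8 + .2 \times .5 = .9$$

$$\frac{\sum z_1'F_2}{N} = .8\,\frac{\sum F_1F_2}{N} + .2\,\frac{\sum F_2^2}{N} + \text{zero terms}$$

$$r_{z_1'F_2} = .8 \times .5 + .2 = .6$$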










preliminary analysis that results in improving certain tests and 
dropping others from the battery. In our own analyses we have 
eliminated tests such as those measuring ‘perseveration’ and 
‘oscillation’ either because they did not appear to measure much 
of anything or because they were too difficult to administer to 
groups of subjects. Only rarely has a test turned out to have a 
complexity greater than one with respect to oblique factors. 

Instead of depending solely on clusters for an ideal solution, 
Professor Thurstone has given his concept of ‘simple structure,’ 
illustrated by Fig. 3 taken from his recent text.¹ The ‘primary’
factors are now A, B, and C, and they are obviously not determined
by clusters in this diagram. The complexity of all tests
in Fig. 3 is two, because tests 1, 2, and 3, for example, will each
have a sizeable coördinate with respect to both the OA and the
OB axes, with similar complexity for the other six tests. Whether 
or not actual test batteries conform even approximately to the 
ideal form shown in Fig. 3 is a matter for the individual analyst 
to decide. In the language-arts field, Dr. Chester Harris believes 
that numerous test complexities suggest Fig. 3 as a better model 
than Fig. 2. If so, he should use model 3 and the appropriate 
methodology that accompanies it. If model 2 is appropriate, 
as it is in my own work, then very simple cluster methods of 
analysis suffice. 

In recent correspondence with Dr. Harris, we have agreed upon 
a uniform notation for discussing oblique factor analysis. There 
is also obvious agreement on another point: the three factors in 
Fig. 2 are both ‘cluster’ and ‘primary’ factors in Thurstone’s 
sense. There are, of course, innumerable variations of Fig. 2 
and 3. Dr. Harris and I have agreed to designate the resulting 
factors as ‘cluster’ factors if they are obtained by ‘cluster’ 
methods alone, but to designate the factors as ‘primary’ if the 
test configuration resembles Fig. 3 to some extent, at least, and 
to employ appropriate rotational methods. 

The remainder of this article will be devoted largely to two 
actual examples illustrating the applicability of my ‘Simple 
Method of Factor Analysis’ to batteries of tests that conform
approximately to the ideal of Fig. 2. The point of the illustra- 
tions is merely that if the tests do cluster in subgroups, a very 





1 Op. cit., p. 350. 










simple objective method will give the same solution as that 
obtained by first getting the centroid pattern and then employ- 
ing many rotations to obtain a ‘psychologically meaningful’ 
oblique solution. 


3.—OBLIQUE FACTOR SYSTEMS 


In determining the weights of his ‘primary’ factors, Thurstone 
uses a system of axes perpendicular to the planes defined by the 
primary axes shown in Figs. 2 or 3. These new axes I have 
already designated as ‘simple’ axes in Student Manual, so this 
term will be retained. 

The matrix used by Thurstone in identifying the primary 
factors is the structure (in my sense) of the simple axes; that is, 
correlations between tests and the simple factor axes. The 
matrix I would use in identifying cluster or primary factors is 
the cluster factor pattern or the primary factor pattern. Either 
of our approaches gives zeros in the same places in the matrices, 
as illustrated in the Student Manual. The proportionality of 
the corresponding entries in P and V is shown by Thurstone.¹

Whether an analyst is employing ‘cluster’ or ‘primary’ oblique 
factors, a pattern and a structure are needed; the former for the 
determination of test saturation with factors and the latter for 
the estimation of factors. Only one oblique factor system is 
needed, however, such as that shown in either Fig. 2 or Fig. 3. 
Since Thurstone has used what I call ‘simple’ axes to identify 
the primary factors, this system will need to be added for clarity. 
A corresponding set of oblique axes perpendicular to the planes 
defined by the ‘cluster’ factors will be designated here as ‘normal’ 
factors to complete the picture. 

In the table of notation which follows I have retained S and P 
for ‘cluster’ structure and pattern matrices, and other symbols 
are chosen as nearly like Thurstone’s as possible. Using the 
subscript p to denote ‘primary’ factors, the notation may be 
summarized as follows: 


                 Cluster    Normal    Primary    Simple
                 factor     factor    factor     factor
Structure ....     S          V       $S_p$      $V_p$
Pattern ......     P          W       $P_p$      $W_p$





1 Op. cit., p. 353. 










By a simple proof resembling that given by Thurstone, a rela- 
tionship between W and S similar to that between P and V may 
readily be obtained, giving the following equations for the cluster 
and primary factor systems: 


$$
\begin{aligned}
PD &= V, & P_pD_p &= V_p, \\
WD &= S, & W_pD_p &= S_p,
\end{aligned}
\tag{3}
$$


where D is a diagonal matrix giving the intercorrelations between
corresponding cluster and normal factors, and $D_p$, the correlations
between primary and simple factors.

If the analyst is interested only in the identification of factors, 
then only the matrix P in the cluster method, or only the matrix 
$V_p$ in the primary method is needed. In case the cluster and
primary axes are the same, as in Fig. 2, the first of equations (3) 
shows that either P or V identifies the factors equally well, 
although the entries in the P matrix are numerically larger 
because the correlation entries in D are less than unity. 
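
A two-factor illustration may make the relation PD = V concrete. The sketch below reuses test 1 of Section 1 (pattern values .8 and .2, factor correlation .5) and an assumed Cartesian layout in which $F_1$ lies on the x-axis. With only two factors, each normal axis is simply the unit vector perpendicular to the other factor. The variable names are illustrative only.

```python
import math

r12 = 0.5                      # correlation between F1 and F2 (Section 1)
angle = math.acos(r12)         # angle between the two factor axes
b1, b2 = 0.8, 0.2              # pattern values of test 1

# Assumed Cartesian picture: F1 on the x-axis, F2 at `angle` from it.
f1 = (1.0, 0.0)
f2 = (math.cos(angle), math.sin(angle))
z = (b1 + b2 * math.cos(angle), b2 * math.sin(angle))  # common-factor part of test 1

# Normal axes: unit vectors perpendicular to the *other* factor.
n1 = (math.sin(angle), -math.cos(angle))   # perpendicular to F2
n2 = (0.0, 1.0)                            # perpendicular to F1


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


d1, d2 = dot(f1, n1), dot(f2, n2)   # diagonal entries of D
v1, v2 = dot(z, n1), dot(z, n2)     # normal structure of test 1

print(f"D  = diag({d1:.3f}, {d2:.3f})")
print(f"V  = ({v1:.3f}, {v2:.3f})")
print(f"PD = ({b1 * d1:.3f}, {b2 * d2:.3f})")
```

Both diagonal entries of D are below unity, so each entry of V is a shrunken copy of the corresponding entry of P, and zeros fall in the same places in both matrices.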


4.—NUMERICAL EXAMPLES 


The first example is Thurstone’s twenty-one-variable problem.¹
Table 1 shows Thurstone’s solution designated here as $V_p$,
obtained by first getting a centroid pattern and then employing
rotational methods to determine an acceptable $V_p$ matrix to
identify the factors. Inspection of the entries in this table shows 
that the great majority of tests have only one large entry in each 
row, so that the corresponding pattern would then reveal a com- 
plexity of approximately one for all such tests. 

In applying the Simple Method² it is necessary to section the
correlation matrix according to the groupings already identified 
by Thurstone, and insert estimates of the communalities. In 
the present case we used Thurstone’s centroid communalities 
because they were conveniently available. The remainder of the 
calculation for S involves merely the application of the centroid 
method (first centroid only) to various sections of the matrix. 
The entries on the left side of Table 2 are the correlations between 





1 L. L. Thurstone and Thelma Gwinn Thurstone, Factorial Studies of
Intelligence, p. 91. Psychometric Monographs, No. 2. Chicago: University
of Chicago Press, 1941.

2 Karl J. Holzinger, “A Simple Method of Factor Analysis,” Psychometrika,
IX (December, 1944), 257-62.








TABLE 1.—STRUCTURE VALUES FOR THURSTONE’S TWENTY-ONE-VARIABLE EXAMPLE

[Table printed sideways in the original; the numerical entries are not legible
in this copy. Rows: variables 1 to 21. Column groups: Normal Structure, V,
and Simple Structure, $V_p$, each with factor columns P, M, V, W, S, N, R.]








TABLE 2.—CLUSTER STRUCTURE AND PATTERN FOR TWENTY-ONE-VARIABLE EXAMPLE

[Table printed sideways in the original; the numerical entries are not legible
in this copy. Rows: variables 1 to 21. Column groups: Cluster Structure, S
(left), and Cluster Pattern, P (right), each with factor columns P, M, V, W, S, N, R.]










tests and ‘cluster’ factors. The cluster pattern P is calculated
by a routine form given in Holzinger and Harman¹ to complete
the cluster factor solution. Test saturations with factors may
be accurately determined as illustrated in Section 1. The numbers
in the P matrix are all larger than the corresponding entries
in Thurstone’s $V_p$ matrix of Table 1. If the ‘cluster’ and
‘primary’ factors were identical, then the columns of P would
be exactly proportional to those of $V_p$, the factors of proportionality
being the correlations in the diagonal matrix D, as already
indicated by equations (3).
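
The steps just described can be sketched on a toy problem. Everything numerical below is hypothetical: a four-test reduced correlation matrix with assumed communalities in the diagonal, built from a known two-factor pattern so that the outcome can be checked. The first-centroid formula (section sum divided by the square root of the group total) and the step P = S times the inverse of the factor correlations are one reading of the procedure, not a reproduction of the routine form in Holzinger and Harman.

```python
import math

# Hypothetical reduced correlation matrix (communalities on the diagonal).
R = [
    [0.64, 0.56, 0.24, 0.36],
    [0.56, 0.49, 0.21, 0.315],
    [0.24, 0.21, 0.36, 0.54],
    [0.36, 0.315, 0.54, 0.81],
]
groups = [[0, 1], [2, 3]]   # assumed sectioning: tests 1-2 and tests 3-4


def block_sum(rows, cols):
    """Sum of the entries of R in the given section."""
    return sum(R[i][j] for i in rows for j in cols)


# First-centroid scaling constant for each group: root of the group total.
scale = [math.sqrt(block_sum(g, g)) for g in groups]

# Structure S: correlation of every test with each cluster factor.
S = [[block_sum([i], g) / scale[k] for k, g in enumerate(groups)]
     for i in range(len(R))]

# Correlation between the two cluster factors.
r12 = block_sum(groups[0], groups[1]) / (scale[0] * scale[1])

# Pattern P = S x inverse(phi), written out for the two-factor case.
det = 1.0 - r12 ** 2
P = [[(s1 - r12 * s2) / det, (s2 - r12 * s1) / det] for s1, s2 in S]

print("factor correlation:", round(r12, 3))
for i, (s_row, p_row) in enumerate(zip(S, P), start=1):
    print(i, [round(v, 3) for v in s_row], [round(v, 3) for v in p_row])
```

In this constructed case each test comes out with a single sizeable pattern entry, while its structure row still shows a substantial correlation with the other factor; that contrast is the reason the pattern rather than the structure is used to judge saturation.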

The matrix V has been added to Table 1 merely for convenience
of interpretation.² Again, if the cluster and primary
axes were identical, V would be exactly equal to $V_p$. The small
variations in these two matrices are due to the slight differences
in the two factorial systems. It is remarkable that the agreement
between V and $V_p$ is so close when it is considered that
Thurstone’s rotational methods involve some subjectivity but
that the Simple Method leading up to V is wholly objective
once the tests are grouped and communalities are inserted in
the diagonals of the correlation matrix. I would accept P, V,
or $V_p$ as excellent primary factor solutions. I would prefer P
because it is so simple to obtain and also because it yields a 
pattern revealing precisely the saturation of tests with factors. 

The second example is a thirteen-variable problem from my 
text.³ The Simple Method yields the columns headed S, P, and
V of Table 3. Again V is introduced only for comparative purposes
since the solution is complete with S and P. Dr. Swineford
computed the last column, $V_p$, by Thurstone’s method of
extended vectors, using an initial centroid pattern.⁴ Inasmuch
as the largest numerical discrepancy between the high entries in 





1 Op. cit., pp. 386-89. 

2 The normal factor structure V was computed by Dr. Frances Swineford
from the relationship $V = PD = P[T'\Lambda]$, where $T$ is the matrix of the
direction cosines of the factors of P referred to any orthogonal system of
vectors. One such matrix is that suggested by Thurstone (p. 174): a
diagonal factor pattern for $\phi$, the intercorrelations of the cluster factors.
The inverse $[T']^{-1}$ of this factor matrix is then normalized by columns to
become $\Lambda$. The inverse is easily computed because all the entries above
the diagonal in $T'$ are zero.

3 Holzinger and Harman, op. cit., p. 30 (Tests 1-13 only).

4 Ibid., p. 189.











TABLE 3.—STRUCTURE AND PATTERN VALUES FOR THIRTEEN-VARIABLE EXAMPLE

[Table printed sideways in the original; the numerical entries are not legible
in this copy. Rows: variables 1 to 13. Column groups: Cluster Structure, S;
Cluster Pattern, P; Normal Structure, V; Simple Structure, $V_p$; each with
factor columns Spatial, Verbal, Speed.]










$V$ and $V_p$ is only .02, cluster and primary factors are obviously
identical. 

By way of summary, if the clustering of tests suggests a model 
approximating Fig. 2, then the Simple Method gives a direct 
solution in which P (or $P_p$, since the solutions are identical) gives
an exact basis for evaluating factor saturations. The coefficients
in P are coördinates in the linear expressions relating tests and
factors, and the squares of these weights show the relative 
importance of the various factors, as shown in Section 1. 

In the case of Thurstone’s $V_p$ matrix, used to identify what
he calls ‘simple structure,’ the entries by columns are proportional
to those in $P_p$. Of course, either $P_p$ or $V_p$ will identify
Thurstone’s simple structure model equally well, but I prefer $P_p$
for the reasons given in the preceding paragraph. 

Personally, I vote for Fig. 2 as a more probable and useful 
model in test analysis than Fig. 3 on the basis of analyzing many 
large batteries of tests by all factorial methods. This is intended 
as a statement based on the experience in one laboratory. If 
other test materials in other laboratories fit the model of Fig. 3, 
with several complexities of at least two, then I should recommend
Thurstone’s method of analysis or Harris’ recent adaptation¹
of the Simple Method where no clusters occur.

Finally, if Fig. 3 were found in extreme form in a battery of 
tests, I should be very unhappy, because all the tests would 
involve at least two common oblique factors. If, on the other 
hand, the tests fitted Fig. 2, the effect would be very pleasing,
because of the uni-factor nature of the solution, which I regard 
as the ideal model for factor analysis. 





1 Chester W. Harris, Further Application of the Method of Direct Rotation. 
(Unpublished mss.) 











A CLASS OF METHODS FOR ESTIMATING 
REACTION TO STIMULI OF VARYING 
SEVERITY* 


PHILIP J. McCARTHY 


Social Science Research Council 


I. INTRODUCTION


There are examples in many fields of experimentation in which, 
after an object has been subjected to a stimulus of specified 
severity, it is observed only that the object reacts in one of two 
ways. For example, in biology an animal may be given a speci- 
fied dose of a drug and one may only observe that the animal does 
or does not die; in explosives research, a weight may be dropped 
on a small sample of explosive and one may only note that there is 
or is not an explosion; and in psychophysical research, a sound of 
given pitch and intensity may be presented to a subject, and one 
only obtains the information that the sound is or is not heard. 
In order to have a terminology which will apply to all such situa- 
tions, let us say that the object under test either does or does not 
react to a stimulus of specified severity. The purpose in conduct- 
ing such experiments is to arrive at some description of the rela- 
tionship between reaction and severity of stimulus. 

Now the mere information that a single object subjected to a 
stimulus of specified severity did or did not react does not indi- 
cate very much about any relationship that may exist between 
reaction and severity of stimulus. However, suppose that one 
presents the same stimulus to a number of objects. Then the 
percentage of objects which react at this severity can be computed. 
From this consideration it is natural to assume that there exists a 
function, say f(x), which gives the percentage of objects, p, which 
will react at the severity x. In psychological experimentation
this function has been called a psychometric function; in biology, 
it has been known as a dosage-mortality curve; while in explosives 
testing it is usually called a sensitivity curve. However, there is 
one distinction which should be made concerning its interpreta- 
tion in these three fields. In biology and explosives research, an 





* Presented to a Joint Meeting of the Eastern Psychological Association 
and the Institute of Mathematical Statistics, Atlantic City, April 26, 1947. 


object can be used only once in the course of an experiment. 
Either it dies, explodes or is so changed that it cannot be used 
again. Consequently, the function which we have been dis- 
cussing describes the relationship of reaction to stimulus from 
object to object. On the other hand, in psychological research 
the object is usually a human observer, and the stimulus is 
such that it can be presented again and again to the same ob- 
server. There may or may not be a change in the observer from 
stimulus to stimulus and this constitutes one of the problems 
of such research. Therefore, the psychometric function essen- 
tially describes intra-individual variation in reaction to stimuli. 
Nevertheless, experimentation has shown that there is also inter- 
individual variation. In this paper, all remarks as they apply to 
psychological research will be concerned solely with the problem 
of measuring intra-individual variation with the restriction that 
there is no systematic change in the observer from stimulus to 
stimulus. 

The aim of an experiment on reaction to stimuli is to provide 
estimates of one or more numbers which will describe the way in 
which the percentage of reactions increases with the increasing 
severity of stimulus. The quantities which are frequently desired 
are: 

1) an estimate of the percentage reactions at fixed severity of 
stimulus, and 

2) an estimate of the severity at which a fixed percentage will
react.

Thus, if an observer is presented with sounds of fixed intensity 
but with varying pitch, the quantity usually desired is the pitch 
(level of severity) at which he hears the sound fifty per cent of the 
time; namely, the stimulus limen. On the other hand, sensitivity
research in connection with explosives is frequently concerned 
with problems of safety in handling, and so interest is centered 
more on extreme percentages, say the ten per cent or ninety per 
cent points. The difference in desired percentages in these two 
problems immediately prompts one to ask how the two problems 
of estimation given above depend upon the percentage. Certain 
general comments can be made on this subject as follows: 

1) When an estimate of the percentage reacting at a fixed 
severity is required, and when this percentage is neither very
small nor very large, then it is sufficient to make an adequate
number of observations at this level. 

2) When the percentage reacting at a fixed severity is very 
small or very large, a prohibitive number of observations at this 
severity will usually be required in order to obtain a useful esti- 
mate of the percentage (e.g., one with the same relative error as 
in (1)). Under these circumstances it is desirable to make use of 
the assumed functional relationship which has already been men- 
tioned; e.g., the psychometric function. 

3) When an estimate of the severity at which a moderate per- 
centage will react is desired, the results will usually not depend 
too much on the assumed functional relationship between per- 
centage reactions and severity of stimulus. 

4) When an estimate of the severity at which a very large or 
very small percentage will react is desired, the results will depend 
very markedly on the assumed functional relationship. 
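
Comment (2) can be made concrete with a minimal sketch. It assumes simple binomial sampling at a single severity, which is an assumption of this illustration and not a result quoted from the reports cited. Under that assumption the standard error of an observed proportion is sqrt(p(1 - p)/n), so the number of observations needed to hold a fixed relative error grows roughly as 1/p when p is small; for very large percentages the same argument applies to the proportion of non-reactions. The function name `trials_for_relative_error` and the relative error of 20 per cent are hypothetical choices.

```python
import math


def trials_for_relative_error(p, rel_err):
    """Observations needed so that the binomial standard error of the
    observed proportion is about rel_err * p (simple binomial sampling)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # SE / p = sqrt((1 - p) / (n * p)); solve for n and round up.
    return math.ceil((1.0 - p) / (p * rel_err ** 2))


# Compare a moderate percentage with two small ones.
for p in (0.50, 0.10, 0.01):
    print(p, trials_for_relative_error(p, 0.20))
```

The required number of observations rises steeply as the percentage becomes small, which is why an assumed functional relationship becomes attractive in that region.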

Since the preceding paragraphs indicate that an assumed func- 
tional relationship between percentage reacting and severity of 
stimuli, say p = f(x), is often of primary importance in our
estimation problems, it seems desirable to summarize briefly some
of the functions that are being used in several different fields of 
experimentation. Perhaps the most common assumption is that 
the severity of stimulus can be measured on a scale for which the 
percentage reacting varies according to a cumulative normal dis- 
tribution. Thus in dosage-mortality it is assumed that the 
percentage killed varies according to a cumulative normal dis- 
tribution when plotted against log dose. For explosives research, 
when a weight is dropped on samples of explosives from varying 
heights, it is usually assumed that the percentage explosions 
varies according to a cumulative normal distribution when plotted 
against log height of drop, or against height of drop. Psycholo- 
gists are familiar with the assumption of a normal curve as 
expressed in the phi-gamma hypothesis or in the phi-log-gamma 
hypothesis. In his work on energy and vision Hecht¹ has made
use of a cumulative Poisson distribution, while Stevens, Morgan
and Volkmann² have postulated a cumulative rectangular
distribution to describe certain situations in the discrimination of
loudness and pitch. 
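
The most common of these assumptions, the cumulative normal distribution, may be sketched as follows. The parameter names `mu` (the fifty per cent point) and `sigma` (the standard deviation), and the helper `log_scale_response` for the log-dose or log-height case, are labels introduced here for illustration only.

```python
import math


def cumulative_normal(x, mu, sigma):
    """Proportion reacting at severity x under a cumulative normal curve
    with fifty per cent point mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def log_scale_response(dose, mu_log, sigma_log):
    """Same curve when severity is measured as the common logarithm of
    the dose (or height of drop); mu_log and sigma_log are on that scale."""
    return cumulative_normal(math.log10(dose), mu_log, sigma_log)


# The curve passes through fifty per cent at mu and rises with severity.
for x in (-1.0, 0.0, 1.0):
    print(x, round(cumulative_normal(x, 0.0, 1.0), 4))
```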

However, in spite of these samples of various assumed
distributions, most of the statistical work on methods of testing and on
analysis of the data has been confined to the normal assumption. 
For example, Guilford³ presents the methods of Urban for fitting
a cumulative normal distribution to data taken by the Method of
Constant Stimuli. Moreover, this same problem has been
considered by Bliss⁴ and Irwin and Cheeseman⁵ as it applies to dosage
mortality work. The procedure they use has come to be known 
as the method of probits, and the theory involves the use of the 
method of maximum likelihood. Unfortunately, the computa- 
tional burden is extremely heavy in both of these instances. 

This last feature, excessive computation, was one of the factors 
which led explosives research organizations to seek other methods 
of experimentation and analysis which could be effectively used in 
testing the sensitivity of explosives. During the course of war 
research, the Statistical Research Group at Princeton University 
investigated the problem of providing appropriate statistical 
analysis for some of these methods, and also devised some new 
experimental techniques. This work has been treated in detail 
in an Applied Mathematics Panel Report⁶ and in a Bureau of
Ordnance Report.⁷ It is my purpose here to present the principal
details of several of these methods in the hope that they can be 
modified and adjusted in such a way as to provide a useful con- 
tribution in the field of psychophysical testing. 


II. GENERAL CONCEPT OF A ‘STAIRCASE METHOD’ 


As defined in the NAVORD Report,⁷ the term ‘staircase method’
is applied to any method where the severity of stimulus applied 
to the next trial or group of trials is directly determined by the 
results of the last trial or group of trials. This means that there 
are certain general requirements which must be met before a 
staircase method can be considered appropriate for a particular 
situation. These are: 

1) Trials must be made one after another. 

2) The result of a trial must be immediately available. 

3) Changes in severity of stimulus must be easy to make. 
If these conditions are satisfied, then it may be possible to use a 
staircase method. At this juncture, it may be of interest to point 
out that the Method of Limits as used in psychophysical work is a 
staircase method by the above definition. As a matter of fact,
the NAVORD Report⁷ presents this method under the title of the
Single Explosion Method. 


III. EFFICIENCY OF A METHOD 


Before giving some examples of staircase methods, there is one 
other aspect of the problem of choosing a method which is worthy 
of consideration; namely, the efficiency of the method. Thus, if 
one has to choose between two methods which provide estimates 
with the same degree of error, but which require different numbers
of observations, one will naturally pick that method which 
requires the smaller number of observations. This may be of 
vital importance if observations are difficult to obtain or if the 
problem of ‘practice’ on the part of the observer is a serious one. 
Similarly, if it is very costly to have a reaction, e.g., when an 
experimental animal dies, then one may wish to choose that 
method which requires the smallest number of reactions, subject 
of course to restrictions about the error of estimate. From these 
remarks one can see that the efficiency of a method will depend 
upon some combination of the following three quantities: 

1) Variance of the estimated quantity (e.g., severity of 
stimulus corresponding to a fixed percentage of reactions). 

2) Average number of observations needed to obtain a 
single estimate of the desired quantity by the method. 

3) Average number of reactions obtained in one such 
determination. 

In any practical situation it is necessary to have some ‘effi- 
ciency function’ which will provide a quantitative measure of the 
efficiency of any given test method. Thus in the NAVORD
Report⁷ the efficiency of a test in obtaining accuracy from few
trials is called ‘Accuracy per trial’ and is defined as


Accuracy per trial
= 1/[(mean square error of a single test)
× (average number of trials per test)]


where the mean square error of a single test is measured on the 
same scale for all methods being compared. Thus, if two tests 
provide estimates which have the same mean square errors, that 
test which uses the smaller number of trials per test, on the 
average, will have the greater accuracy per trial. 










When it is costly to have a reaction, there are rarely any fixed
limitations on the number of trials and the natural criterion is


Accuracy per reaction
= 1/[(mean square error of a single test)
× (average number of reactions per test)]


To explain and partly justify these criteria, consider the case of 
an agency which is willing to make one hundred trials on a specific 
object and which has to choose between 

Method A. mean square error of a single test estimate = 0.3 
average number of trials for a single test = 10 
and 
Method B. mean square error of a single test estimate = 0.5 
average number of trials for a single test = 5. 
If Method A is used, one hundred trials will allow about ten
repetitions, and the mean square error of the result will be about

0.3/10 = 0.03.

If Method B is used, one hundred trials will allow about twenty
repetitions, and the mean square error of the result will be about

0.5/20 = 0.025.

Clearly Method B would be chosen. The same decision would be 
reached by using the efficiency function since the respective accu- 
racies per trial are .33 and .40. Similar considerations apply 
concerning accuracy per reaction, and also concerning other effi- 
ciency functions which can be devised. 
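
The comparison of Methods A and B may be checked with a short sketch. The function names `accuracy_per_trial` and `pooled_mse` are introduced here for illustration; the second assumes, as the text does, that the mean square error of an average of independent repetitions is the single-test value divided by the number of repetitions.

```python
def accuracy_per_trial(mse_single_test, trials_per_test):
    """Accuracy per trial: reciprocal of (mean square error of a single
    test) times (average number of trials per test)."""
    return 1.0 / (mse_single_test * trials_per_test)


def pooled_mse(mse_single_test, trials_per_test, total_trials):
    """Mean square error after spending total_trials on repeated tests,
    assuming independent repetitions that are simply averaged."""
    repetitions = total_trials // trials_per_test
    return mse_single_test / repetitions


methods = {"A": (0.3, 10), "B": (0.5, 5)}
for name, (mse, trials) in methods.items():
    print(name,
          round(accuracy_per_trial(mse, trials), 2),
          round(pooled_mse(mse, trials, 100), 3))
```

The method with the larger accuracy per trial is also the one with the smaller pooled mean square error for a fixed budget of trials, which is the sense in which the criterion is justified.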

Although the staircase methods are well adapted for meeting 
many of these requirements of efficiency, there is not space in the 
present paper to present this aspect of the problem in more detail. 
However, the interested reader is referred to the NAVORD
Report.⁷


IV. THE SINGLE EXPLOSION METHOD (METHOD OF LIMITS) 


The Method of Limits is well known in psychophysical experi- 
mentation. Stimuli are presented to the observer in ascending or 
descending order of severity, a single stimulus being presented at 
each severity, until the observer reports reaction (in the ascending
series) or no reaction (in the descending series). The levels at
which the stimuli are presented are usually spaced at equal inter- 
vals on the severity scale. The limen is taken as the midpoint 
between the last two stimulus values for any one series, and these 
estimated limens are then averaged over all series. This is
similar to the procedure which has been called the Single Explo- 
sion Method in testing the sensitivity of explosives. Here a 
weight is first dropped on a sample of explosive from a height 
such that no explosion will occur. Then the height of drop is
increased by constant steps, a single drop being made on a new 
sample of explosive at each test height until an explosion occurs. 
The height of drop at which the first explosion occurs is taken as 
a measure of the sensitivity of the explosive. It is obvious that 
this method can also be used from the other direction, i.e., the 
first drop is at a height where an explosion is sure to occur and
then the height of drop is decreased by constant amounts until
the first height is reached at which no explosion occurs. 

It follows from the preceding description that only one piece of 
information is available after completing a single trial on one of 
these methods; namely, the severity at which the first reaction (or 
non-reaction) occurs. One ordinarily desires to know the per- 
centage corresponding to the average value of this terminal 
severity, assuming the experiment is repeated many times, and to 
obtain some notion of the error of this estimate. It seems to have 
been implicitly assumed in the Method of Limits that the average 
value of this terminal stimulus less one-half the distance (on the 
severity scale) between successive stimuli corresponds to the 
fifty per cent point. This is not necessarily true, as will be 
shown shortly. However, the practice of averaging the results of 
equal numbers of ascending and descending series has undoubt- 
edly prevented any serious errors from being committed and has 
probably led to very good estimates of the limen. 

Let us see what progress can be made in an analytic way toward 
discovering the answers to these problems. Assume that stimuli
are presented at severity levels $x_0$, $x_1 = x_0 + h$, $x_2 = x_0 + 2h$,
etc. Then, if there is an underlying function, f(x), which gives
the proportion of objects which will react at severity x, we can
say that the probability of obtaining a reaction at level $x_i$ is
$p_i = f(x_i)$. Consequently, the probability of obtaining the first
reaction at level $x_i = x_0 + i \cdot h$, when testing is started at level $x_0$,
is given by

$$(1 - p_0)(1 - p_1) \cdots (1 - p_{i-1}) \, p_i.$$

From these probabilities one can immediately compute the mean 
value of the severity at which the first reaction occurs and the 
variance of this level of first reaction. A conversion to the per- 
centage scale can then be made by using the function f(x). This 
process of computation involves knowing 

1) The function f(x), and 

2) The position of the test levels with respect to f(x). 

Some computations are given in the NAVORD Report⁷ based
on the assumption that f(x) is given by a cumulative normal 
curve. The results of these computations are summarized in 
Figure 1. For these computations it has also been assumed that 
testing is started at a severity for which the probability of a 
reaction is almost zero, and that the distance between successive 
test levels (step size) is measured in terms of σ, the standard
deviation of the cumulative normal curve. The behavior of this 
curve is intuitively obvious since a sufficiently large step size 
means that the second trial will certainly be made at a severity 
corresponding to a percentage greater than fifty. Similarly, a 
small step size means that a large number of trials will be made at 
low probability levels, and consequently the first reaction can be 
expected at a severity corresponding to a relatively lower per- 
centage. A curve such as that shown in Figure 1 can be quickly 
computed for any relationship between severity of stimulus and 
percentage reactions, and this is one of the advantages of using a 
simple method such as this Single Explosion Design. In practice, 
one will seldom know the step size in σ units even though the
fundamental function is known or assumed. However, if the
method is repeated a number of times, an estimate of σ in terms
of step size can be obtained. This is discussed more fully in the
NAVORD Report.⁷ Also, the effect of using one form of relationship
when another one actually holds has been investigated by
the Princeton Statistical Research Group, and the results will be 
available in the near future. 
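
The computation described in this section can be sketched as follows for an ascending series. This is a minimal illustration and not the computation of the NAVORD Report: it assumes a cumulative normal f(x) with fifty per cent point 0 and σ = 1, a starting level of -6 (where the probability of a reaction is almost zero), and one particular position of the test levels with respect to f(x). The names `first_reaction_distribution` and `mean_and_variance` are introduced here, and the numerical results depend on the assumed starting level, so no attempt is made to reproduce Figure 1 exactly.

```python
import math


def cumulative_normal(z):
    """Cumulative normal curve with fifty per cent point 0 and sigma 1."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def first_reaction_distribution(x0, h, f, tol=1e-12, max_levels=100000):
    """Return [(level, probability that the first reaction occurs there)]
    for an ascending series started at x0 with step size h."""
    dist = []
    survive = 1.0  # probability of no reaction at any lower level
    for i in range(max_levels):
        x = x0 + i * h
        p = f(x)
        dist.append((x, survive * p))
        survive *= 1.0 - p
        if survive < tol:
            break
    return dist


def mean_and_variance(dist):
    """Mean and variance of the level of first reaction."""
    total = sum(w for _, w in dist)
    mean = sum(x * w for x, w in dist) / total
    var = sum((x - mean) ** 2 * w for x, w in dist) / total
    return mean, var


for h in (0.2, 0.5, 1.0):
    dist = first_reaction_distribution(-6.0, h, cumulative_normal)
    mean, var = mean_and_variance(dist)
    # Convert the mean level to the percentage scale by means of f(x).
    print(h, round(mean, 3), round(var, 3),
          round(100.0 * cumulative_normal(mean), 1))
```

Any other assumed relationship can be examined by passing a different function in place of `cumulative_normal`.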

In the event that the function f(x) is symmetric about the fifty
per cent point, the severity levels estimated by ascending and
descending series will also be symmetric about the severity level
corresponding to the fifty per cent point and so an average of the
two results will estimate the fifty per cent point. An investigation
could be carried out very readily to show the effect of skewness
on this estimate. Many variants of this single explosion
method have been proposed in the NAVORD Report,⁷ but it is
impossible to describe these here.

[Fig. 1. Percentage Corresponding to Average Level of First Reaction in
Single Explosion Method. Vertical axis: percentage (2% to 50%);
horizontal axis: step size in units of σ (0 to 1.0).]

In this discussion, as well as in the ones which follow, no 
attempt has been made to take into account factors peculiar to 
psychophysical testing such as habituation and expectation. 
The introduction of these factors would require additional 
investigation. 








V. THE UP AND DOWN METHOD


During the course of war research, an experimental procedure 
known as the Up and Down Method was developed at the Explosives
Research Laboratory, Bruceton, Pennsylvania. Translated
into our general terminology, the procedure is as follows:


1) A stimulus of severity $x_0$ is applied to an object.

2) If a reaction is obtained, a stimulus of severity $x_0 - h$
is applied to the next object (in psychophysical experimentation
the same object, namely the observer, is used
throughout). If no reaction is obtained from the first
application, the second application is made at severity
$x_0 + h$.

3) In general, a trial will be made at a severity h less than 
that severity at which the previous object was tested if a 
reaction was obtained, and at a severity h more if no reac- 
tion was observed. 


In this manner one will obtain a sequence of reactions and non- 
reactions which may be recorded as in Figure 2. The crosses 
represent reactions and the o’s represent non-reactions. The 
process can be carried out until any desired number of observa- 
tions have been obtained. 
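
The three rules above can be sketched as a short simulation. The reaction probabilities are taken from a cumulative normal curve purely for illustration; the function name `up_and_down`, the starting level, the step size, and the seed are hypothetical choices and are not taken from the reports cited.

```python
import math
import random


def cumulative_normal(x, mu=0.0, sigma=1.0):
    """Assumed probability of a reaction at severity x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def up_and_down(x0, h, n_trials, react_prob, rng):
    """Run an Up and Down sequence and return (severity, reacted) pairs."""
    record = []
    x = x0
    for _ in range(n_trials):
        reacted = rng.random() < react_prob(x)
        record.append((x, reacted))
        # Step down after a reaction, up after a non-reaction.
        x = x - h if reacted else x + h
    return record


rng = random.Random(1949)
record = up_and_down(x0=1.0, h=0.5, n_trials=9,
                     react_prob=cumulative_normal, rng=rng)
for trial, (x, reacted) in enumerate(record, start=1):
    print(trial, x, "X" if reacted else "O")
```

Because each severity is determined by the preceding result, the trials concentrate near the fifty per cent point without that point being known in advance.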

A statistical analysis for data derived from such experiments 
has been developed, and this analysis is presented in Applied 
Mathematics Panel Report No. 101.1R.⁶ This analysis assumes
that the underlying function, f(x), is given by a cumulative normal
curve, and develops maximum likelihood estimates of the
parameters of this curve. Confidence limits are also given for
these estimates, and recommendations are made concerning step 
size and number of observations. A discussion of the uses and 
relative merits of this method is contained in the NAVORD 
Report.⁷ A statistical analysis of this method when applied to
functions other than a cumulative normal curve is much more 
difficult than for the Single Explosion Method, and as yet little 
progress has been made in this direction. 


VI. THE SEQUENTIAL METHOD 


In the Single Explosion and Up and Down Methods no particular
attention is paid to what is accomplished at a fixed level
other than to make use of the fact that a reaction does or does not
occur. A systematic, rather than empirical, approach to the 
problem can be obtained by focusing attention primarily upon 
the relation between the results of testing at a given level and the 
percentage point which is to be estimated. Clearly the level 
corresponding to the desired percentage point must either (1)
be above the level at which the testing is taking place, (2) lie on
the test level, or (3) lie below the test level. Now if a reasonable
criterion can be obtained which will distinguish between these
three possibilities upon the basis of trials made at the level, then
testing at successive levels will give directly usable evidence
concerning the desired percentage point. For suppose that at level
x, the criterion indicates that the desired level is above level x.
Then if testing is done at level x + h and the criterion indicates
that the desired level is below this level, there is evidence that the
desired level is between levels x and x + h. As in the preceding
discussion, it is assumed that there is a fixed set of levels at which
testing is to take place.

[Fig. 2. Sequence of Reactions (X) and Non-Reactions (O) from an Up and
Down Test. Vertical axis: severity (levels such as $x_0 + h$ and $x_0 + 2h$);
horizontal axis: trial number (1 to 9).]

There are certain rather obvious ways in which such a criterion 
can be arrived at. For example, one might simply carry out ten 
(or any other fixed number of) trials on a level and calculate the 
per cent reactions. If this experimental percentage were lower 
than the desired percentage, the next ten trials would be con- 
ducted on the next higher level. If it were higher, the next ten 
trials would be conducted on the next lower level. This proce- 
dure would then be continued until the first time that the results 
on one level indicated that the desired percentage point was 
above this level, and the results on the next higher level indicated 
that the desired percentage point was below this level. The 
actual level assigned to the desired percentage would then neces- 
sarily be taken between the two final levels. 
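
The fixed-number procedure just described may be sketched as follows. The source does not say what is done when the experimental percentage exactly equals the desired percentage; this sketch treats that case as 'higher', which is an assumption. The function name `fixed_batch_search` and the choice of the midpoint as the assigned level are likewise introduced here for illustration.

```python
import math
import random


def cumulative_normal(x, mu=0.0, sigma=1.0):
    """Assumed probability of a reaction at severity x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fixed_batch_search(x0, h, target, batch, react_prob, rng,
                       max_batches=1000):
    """Test `batch` trials per level, moving one step up or down, until a
    level judged 'above' lies directly under one judged 'below'.
    Returns the midpoint of the two final levels."""
    step = 0
    previous = None  # (step index, verdict) from the preceding batch
    for _ in range(max_batches):
        x = x0 + step * h
        reactions = sum(rng.random() < react_prob(x) for _ in range(batch))
        # True means the desired percentage point lies above this level.
        point_is_above = reactions / batch < target
        if previous is not None and previous[1] != point_is_above:
            lower = min(step, previous[0])
            return x0 + (lower + 0.5) * h
        previous = (step, point_is_above)
        step += 1 if point_is_above else -1
    raise RuntimeError("no bracketing pair of levels found")


rng = random.Random(7)
print(fixed_batch_search(-2.0, 0.5, 0.5, 10, cumulative_normal, rng))
```

Every level costs the full batch of trials, whatever the evidence, and this is the inefficiency that the sequential plan is intended to reduce.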

The simple method which we have been using illustrates the 
testing procedure. However, this particular one is inefficient for 
the task at hand because it requires a large number of trials to 
complete one determination of the desired percentage point. A
considerable saving can be made in this respect by using the
Sequential Probability Ratio Sampling Plan of Wald.⁸ The
application may be described as follows. It is desired to estimate 
the level at which the probability of a reaction is p. After cer- 
tain constants have been chosen (see’), reference to Wald’s paper 
enables one to compute two sequences of integers $u_1, u_2, u_3, \ldots$
and $d_1, d_2, d_3, \ldots$, the subscripts referring to the accumulated
number of trials on a particular level.

Now suppose testing is being done on level x and that it is 
necessary to decide on the basis of trials whether the level corre- 
sponding to p is above x, or below x, or is nearly identical with x. 
As the testing is carried out on this level, a record is made of the 
trial number (n) and the total number of reactions which have 
been obtained in these n trials. After each trial the number of 
reactions is compared with the two sequences above. If, at any
point in the testing, the number of reactions in n trials becomes
equal to $u_n$, testing is discontinued and the statement is made
that the level corresponding to p is above x. On the other hand,
if the number of reactions in n trials becomes equal to $d_n$, testing
is discontinued and the statement is made that the level
corresponding to p is below x. As long as neither of these decisions is
obtained, testing is continued.

If this procedure is applied to a level x as described above, a 
decision one way or the other will eventually be reached. The 
number of trials required to reach this decision will vary from 
test to test, and may, at times, become quite large. For this 
reason it is desirable in the present application to decide upon a 
maximum number of trials which are to be taken at any level. 
If this number of trials is performed on a level with no decision 
being reached, the statement will be made that the desired per- 
centage point lies on this level. This process is usually called 
truncation. 
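
A sketch of the decision rule at a single level, with truncation, is given below. Instead of the two integer sequences it compares the accumulated log likelihood ratio with the two approximate limits given by Wald, log[(1 - β)/α] and log[β/(1 - α)], which play the same role. The bracketing probabilities `p0` and `p1` (taken just below and just above the desired p), the error rates `alpha` and `beta`, and the name `sequential_decision` are assumptions of this illustration and are not the constants of the NAVORD Report.

```python
import math
import random


def sequential_decision(react_prob_at_level, p0, p1, alpha, beta,
                        max_trials, rng):
    """Sequential probability ratio test at one level, with truncation.
    Returns ('above' | 'below' | 'on', number of trials used), where the
    word states where the desired level lies relative to this level."""
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    llr = 0.0  # log likelihood ratio of p1 against p0
    for n in range(1, max_trials + 1):
        if rng.random() < react_prob_at_level:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "below", n  # too many reactions here
        if llr <= lower:
            return "above", n  # too few reactions here
    return "on", max_trials    # truncation: no decision reached


rng = random.Random(3)
print(sequential_decision(0.2, 0.4, 0.6, 0.05, 0.05, 30, rng))
```

Levels far from the desired percentage point are usually settled in a few trials, while levels close to it run on toward the truncation limit.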

The foregoing comments describe the general approach of the 
Sequential Method. There are, however, many details which 
need to be considered in actually designing such a procedure. 
Moreover, the computations required to describe the operation 
of this method, for an assumed functional relationship between 
severity and percentage reactions, are very extensive. A detailed 
treatment of these points will be found in the NAVORD Report.⁷


VII. SUMMARY 


In this paper a brief description has been given of several meth- 
ods which are appropriate for use when an object subjected to a 
stimulus of specified severity either reacts or does not react. 
These methods, known as staircase methods, have been developed 
in testing the sensitivity of explosives. However, it is felt that 
with appropriate modifications they might be found of use in 
psychophysical experimentation and elsewhere. 


REFERENCES 


1) Hecht, Selig, Shlaer, Simon and Pirenne, Maurice H.,
“Energy at the Threshold of Vision,” Science, Vol. 93, No. 2425,
June 20, 1941.

2) Stevens, S. S., Morgan, C. T., and Volkmann, J., “Theory
of the Neural Quantum in the Discrimination of Loudness and
Pitch,” American Journal of Psychology, Vol. 54, 1941, pp. 315-335.

3) Guilford, J. P., Psychometric Methods, McGraw-Hill, 1936.

4) Bliss, C. I., “The Calculation of the Dosage Mortality
Curve,” Annals of Applied Biology, 22, pp. 134-167.

5) Irwin, J. O. and Cheeseman, “On the Maximum-Likelihood
Method of Determining Dosage-Response Curves and Approximations
to the Median-Effective Dose, in Cases of a Quantal
Response,” Supplement to the Journal of the Royal Statistical
Society, Vol. VI, No. 2, 1939.

6) “Statistical Analysis for a New Procedure in Sensitivity
Experiments,” Applied Mathematics Panel Report No. 101.1R,
July, 1944.

7) “Staircase Methods of Sensitivity Testing,” NAVORD
Report 65-46, 21 March, 1946.

8) Wald, Abraham, “Sequential Method of Sampling for
Deciding between Two Courses of Action,” Journal of the
American Statistical Association, Vol. 40, No. 231, September,
1945.








A NUMBER FACTOR 


FRANCES SWINEFORD* 
The University of Chicago 


This study is concerned with a bi-factor which was discovered 
quite by accident, which has now appeared in several factor 
studies, and which has piqued the curiosity of the writer for some 
time. It is deserving of a more thorough investigation than the 
one to be reported; here, we shall merely scratch the surface with 
rather crude instruments and make suggestions for later work. 

The factor is a number factor first identified in an analysis of 
twenty-eight tests given to four hundred fifty-seven ninth-grade 
pupils.¹ It was found common to these eight tests: Add,
Counting Groups of Dots, The ‘3’ and ‘7’ Test, Number Recognition,
Object-Number, Number-Figure, Numerical Puzzles, and
Woody-McCall Mixed Fundamentals. The first three are speed 
tests wherein the problems are made extremely simple in order 
that the scores should be purest possible measures of mental 
speed. Certainly no manipulation of numbers is called for in 
the ‘3’ and ‘7’ test, where the subject is required only to encircle 
all the 3’s and 7’s among rows of digits. The next three tests 
are measures of rote memory—in none of them is any compu- 
tation done, nor are any time limits imposed. Simple arithmetic 
does appear in Numerical Puzzles, and the Woody-McCall test 
includes computation of a wide range of difficulty. There is no 
speed factor present in these tests because the timing is suffi- 
ciently generous to allow practically all pupils to finish. The 
only element which all these tests appear to have in common is 
‘numbers.’ The number-factor loadings were not notable for 
their size, particularly those for the ‘3’ and ‘7’ test and the 
memory tests, but for the fact that they exist at all. One addi- 
tional number test, Series Completion, does not exhibit a loading 
on this factor, but examination of the table of residuals reveals 
positive values with seven of the eight tests listed above, so that 





* Now with Educational Testing Service, Princeton, N. J. 

1 Frances Swineford and Karl J. Holzinger, A Study in Factor Analysis: The 
Reliability of Bi-factors and Their Relation to Other Measures, pp. 14 and 20. 
Supplementary Educational Monographs, No. 53. Chicago: University of 
Chicago Press, 1942. 




it, too, doubtless has a low correlation with the number factor. 
No other test in the battery involves numbers; no other test in 
the battery has positive residuals with all of the nine that have 
been mentioned. 

Some of the factor weights for ‘number’ are so small that they 
might well have been attributable to chance and could have been 
so interpreted without additional evidence. Before proceeding 
to an inquiry into the nature of such a factor, therefore, it is 
necessary to establish its existence with greater certainty. 

First, the report of an earlier study was re-examined.¹ Here,
analyses had been made for two groups of seventh- and eighth- 
grade pupils. At the time of the study no number factor was 
discovered for none was anticipated, and the distributions of the 
final residual correlations were such as to suggest that the analyses 
were adequate. The ‘3’ and ‘7’ test was not given to these 
groups, but the other eight number tests were included in the 
battery of twenty-four. For one group² the mean of the twenty-eight
residuals for these tests is .0202, as contrasted with .0004
for all two hundred seventy-six residuals. For the other group³
the mean of the twenty-eight number-test residuals is .0348 as 
contrasted with .0007 for all two hundred seventy-six residuals. 
Now the means, .0202 and .0348, are minimum values, and if the 
number factor had been included in the pattern plan before the 
general-factor weights were calculated these mean values would 
have been markedly higher. 
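
The counts of residuals quoted above follow from the number of distinct pairs of tests, one residual correlation for each pair. A minimal check, with the helper name `pair_count` introduced here for illustration:

```python
from math import comb


def pair_count(n_tests):
    """Number of distinct pairs of tests (one residual per pair)."""
    return comb(n_tests, 2)


print(pair_count(8))   # pairs among the eight number tests
print(pair_count(24))  # pairs in the battery of twenty-four
```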

For the purposes of the present paper the data yielding the 
above mean of .0348 have been re-analyzed to take into account 
the number factor. The common-factor portion of the pattern 
is given in Table 1. Three of the number-factor weights exceed 
.4 and three are greater than .2. No loading for this factor was
found for Test 15 or for Test 23 (Number Recognition and Series
Completion). Nevertheless, the existence of a ‘number’ factor
is readily apparent.


1 Karl J. Holzinger and Frances Swineford, A Study in Factor Analysis: The
Stability of a Bi-factor Solution. Supplementary Educational Monographs,
No. 48. Chicago: University of Chicago Press, 1939.

2 Ibid., pp. 28-29.

3 Ibid., pp. 34-35.

4 Although the new analysis may be regarded as more precise than the
original one, it would in no way alter the results of the earlier report, for
the factor loadings of the five common factors exclusive of ‘number’ changed
very little, the maximum shift being but .08. Thirty-four of the forty-three
original loadings shifted less than .04.




TABLE 1.—REVISION OF TABLE 10 OF The Stability of a Bi-factor
Solution

Factor
Test | General | Spatial | Verbal | Speed | Memory | Number

1 Visual Perception......... .607 .374 

SG .o a4 bales Gass 44 58 .375 245 
25 Paper Form Board........| .428 .516 
RNs drei eS b east hasan .484 . 262 

5 General Information....... . 4 eee .552 

6 Paragraph Comprehension.| .595 |...... .545 

7 Sentence Completion...... Se ies .669 

8 Word Classification........ i ees .331 

9 Word Meaning........... ery .603 

RS as er ee .342 | — .292 .519 .600 
9 ee eee F «5 re .510 

12 Counting Groups of Dots..| .323 |...... .595 .495 
13 Straight and Curved Capi- 

ai bes ccbiewtweu sees , |, ee .481 

14 Word Recognition......... cc 7 eee . 380 

15 Number Recognition...... [ 3) .452 

16 Figure Recognition........ Te éeaes . 340 

17 Object-Number........... Ce .484 | .205 
18 Number-Figure........... . § Sere .295 | .291 
19 Figure-Word............. OD fei dh .157 
SARA .654 
21 Numerical Puzzles........ , gy See . 229 
22 Problem Reasoning........ .652 
23 Series Completion......... .737 
24 Woody-McCall........... EE Peers .402 























More recently another battery of nineteen tests has been 
administered to a group of ninth-grade pupils. The results for 
the two hundred thirty boys and the two hundred forty-one girls 
were analyzed separately. Among the tests are seven which 
make use of numbers; namely, Arithmetic, Series Completion, 
Number (in the Primary Mental Abilities battery), Counting 
Groups of Dots, The ‘3’ and ‘7’ Test, and two additional tests, 
Numbers and Paragraphs, designed for the purpose of 
investigating this factor from another angle. The latter two tests 
were both dictated, so that the subjects heard the numbers instead 
of seeing them. A typical item from Numbers might consist of the 
reading of a short series of digits followed by the instruction to 
write the second one. In Paragraphs a brief selection was first 
read; then questions were dictated which require numerical 
answers. Two such paragraphs and a total of twenty-two 
questions constitute the test. 

A number factor was postulated through the seven ‘number’ 
tests for each analysis. Curiously, each analysis revealed a sub- 
stantial loading with the Word-Fluency test of the PMA battery. 
Here, the task is to write in the allotted time as many words as 
possible starting with a given letter of the alphabet. Since no 
numbers enter into the test in any way it is difficult to under- 
stand this unexpected factor loading. The only explanation 
which occurs to the writer stems from the fact that the word- 
fluency test immediately follows the PMA number test. If the 
number factor results from a particular mental set peculiar to 
numbers—a thesis to be advanced in this report—then it is con- 
ceivable that such mental set may tend to perseverate through 
the next short test that is administered. This hypothesis has not 
yet been checked, but checking could readily be done. 

Another characteristic of the present analyses is the fact that 
the number factor is much more marked for the girls than for 
the boys. Moreover, the boys’ analysis has no loading at all for 
the series completion test, a result which agrees with the pattern 
of Table 1. This sex difference, likewise, should be checked 
with other samples. 

In Table 2 the factor loadings for the number factor are 
brought together for summary. It will be noted that in only 
three instances was it impossible to obtain a number-factor 
loading where one was expected. Two of these instances occur 
for the series completion type of test. And only twice did a 
loading appear where it was not expected, in the word-fluency 
test already discussed. 











It is clear that a number factor which is not exclusively a 
computing ability factor does exist. The suggestion has been 
ventured that it results from a mental set in attacking the assigned 
task. That is to say, an individual who feels at home with 
numbers may enter upon the task with a different attitude from 
that entertained by the individual who has somehow learned to 
fear or dislike numbers and who may feel inhibited when he sees 
them even before he knows what he will be asked to do. 

TABLE 2. NUMBER-FACTOR LOADINGS FROM FOUR ANALYSES 

Test | 457 Cases (a) | 145 Cases (b) | 241 Girls (c) | 230 Boys (c) 
[test name illegible] | [illegible] | .600 | d | d 
Counting Groups of Dots | .354 | .495 | .639 | .372 
The ‘3’ and ‘7’ Test | .168 | d | .376 | .154 
Number Recognition | .302 | e | d | d 
Object-Number | .222 | .205 | d | d 
Number-Figure | .151 | .291 | d | d 
Numerical Puzzles | .495 | .229 | d | d 
Series Completion | [illegible] | e | .200 | e 
Woody-McCall | .477 | .402 | d | d 
Arithmetic | [illegible] | d | .513 | .276 
Number (PMA) | d | d | .661 | .433 
Numbers | d | d | .332 | .208 
Paragraphs | d | d | .162 | .058 
Word-Fluency (PMA) | [illegible] | [illegible] | .349 | .3899 [sic] 

(a) Swineford and Holzinger, The Reliability of Bi-factors, op. cit. 
(b) From Table 1. 
(c) Complete analyses not yet published. 
(d) Test not given. 
(e) Probably a low positive value. [See text] 

The problem of interpreting the number factor was partly 
met by the use of a scale designed to provide a measure of a 
pupil’s general attitude toward numbers. The scale consists 
of twenty statements such as, “I think I would like a job that 
requires figuring,” and “Whenever I see numbers in a test 
question, I know I shall get the answer wrong.” The pupil indicates 
whether he agrees or disagrees with the statement or is not sure 
about it. Seven statements express a positive attitude toward 
numbers; seven, a negative attitude; and six, neutrality or 
indifference. The rating ‘score’ is the correlation of the pupil’s 
responses with the key, a positive correlation indicating a 
favorable attitude. 
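
The scoring rule can be illustrated with a short sketch. The paper does not say how responses or the key were coded, so the coding used here (agree = +1, not sure = 0, disagree = -1; key = +1 for the seven positive statements, -1 for the seven negative, 0 for the six neutral) and the names `KEY`, `pearson`, and `scale_score` are assumptions made only for illustration.

```python
from math import sqrt

# Hypothetical key: 7 positive, 7 negative, 6 neutral statements.
# The actual item order and coding are not given in the paper.
KEY = [1] * 7 + [-1] * 7 + [0] * 6


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Correlation is undefined when one variable is constant;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return sxy / sqrt(sxx * syy)


def scale_score(responses):
    """Attitude 'score': correlation of a pupil's responses with the key."""
    return pearson(responses, KEY)


# A pupil who agrees with every positive statement, disagrees with
# every negative one, and is unsure about the neutral ones.
favorable_pupil = list(KEY)
print(scale_score(favorable_pupil))
```

Under this coding a pupil whose answers match the key exactly receives +1, and a pupil who answers the opposite way throughout receives -1.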

After the scale had been administered, each pupil was asked 
to write one or two statements of his own which might better 
indicate his reaction to numbers. Although the pupils were 
promised that no one connected with their school would ever see 
their responses to the scale or the statements they had written, 
the usual disadvantages of this technique probably are present. 
A few individuals professed, by the scale and by their own state- 
ments, an interest in numbers which is difficult to accept in the 
light of their test performance, whereas the reverse situation was 
not noted. On the whole, however, there is a low positive associ- 
ation of the scale with the number factor. 

The relationship between the scale and the factor was first 
determined by correlating the scale ‘scores’ with the estimates 
of the number factor. For the girls this correlation is .301, and 
for the boys it is .183. Since it is the girls’ group which appears 
to be the stronger in the number factor, the correlation of .301 
was selected for further verification by factor methods. A sub- 
set of ten tests was selected from the battery. The set includes 
six number tests of which two are also speed tests, one speed test 
without numbers, and three tests which measure the general 
factor but no additional factor common to others in the subset. 
Correlations between the scale ‘score’ and these ten tests were 
computed, and an analysis of the eleven variables was then 
made. ‘The result is given in Table 3 together with that portion 
of the original analysis which involves the same tests in order to 
demonstrate the stability of the solution. 

The scale has loadings on three factors: general, verbal, and 
number. The value of almost .3 for number supports the 
hypothesis that this factor is akin to attitude toward numbers as 
opposed to facility in handling them, and it agrees with the 
correlation with the factor estimates. The correlation of about .2 
with the general factor is a reasonable one; dull individuals are 
less likely to enjoy academic activities than are bright individuals. 

TABLE 3. TWO FACTOR PATTERNS ON 241 GIRLS TO SHOW FACTORIAL COMPOSITION OF SCALE 

[Table body illegible in source. Column groups: Portion of Original Pattern; Analysis of 11 Variables. Legible row labels: 2 Series Completion; Deduction; 6 Word Meaning; 12 Reasoning (PMA); 13 Number (PMA); 15 Counting Groups of Dots; 16 Straight and Curved Capitals; 17 The ‘3’ and ‘7’ Test; 18 Numbers; Scale.] 

TABLE 4. FACTOR ESTIMATES AND SCALE SCORES OF PUPILS WHO EXPRESSED EXTREME ATTITUDES TOWARD NUMBERS 

[Table body illegible in source. Entries are given by case for the ‘like’ and ‘dislike’ groups of girls and of boys: number, general, verbal, and spatial factor estimates and scale ‘score.’] 

A negative overlap appears with Test 6, a verbal test. It is 
not certain whether this overlap is one which is common to these 
two variables only or whether it actually represents the verbal 
factor, as indicated in the table. The latter case was assumed 
because, by assigning to Test 6 the verbal-factor loading which it 
had in the original analysis (rounded), a value of —.25 is obtained 
for the scale, and this value agrees with a correlation of —.245 
found between the scale and verbal-factor estimates. 
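
The reasoning in this paragraph rests on a standard property of an orthogonal factor pattern: the correlation between two variables is reproduced by the sum of the products of their loadings on the factors they share. The sketch below shows how a loading can be solved for once the other quantities are fixed. All numerical values and the names `reproduced_r` and `solve_loading` are hypothetical; they are not taken from Table 3.

```python
def reproduced_r(load_a, load_b):
    """Correlation implied by two rows of an orthogonal factor pattern:
    the sum of products of loadings on the factors both rows contain."""
    return sum(load_a[f] * load_b[f] for f in load_a if f in load_b)


def solve_loading(r_observed, load_known, load_partial, factor):
    """Loading of the partially specified variable on `factor` that makes
    the pattern reproduce r_observed with the fully specified variable."""
    explained = reproduced_r(load_known, load_partial)
    return (r_observed - explained) / load_known[factor]


# Hypothetical values for illustration only.
test6 = {"general": 0.50, "verbal": 0.60}   # a verbal test, loading fixed
scale_so_far = {"general": 0.20}            # scale's verbal loading unknown
r_scale_test6 = -0.05                       # assumed observed correlation

verbal_loading = solve_loading(r_scale_test6, test6, scale_so_far, "verbal")
print(round(verbal_loading, 3))
```

With these assumed inputs the general factor accounts for .10 of the correlation, leaving a residual of -.15 to be carried by the product of the two verbal loadings.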

Some of the statements which the pupils wrote are revealing. 
Many are noncommittal, but a number express extreme liking or 
disliking for numbers. Five from each extreme by each sex are 
reproduced here, and in Table 4 the corresponding factor esti- 
mates and scale ‘scores’ are presented. 


Statements by Girls 


127) I like math because I 
like to do things with figures. 

128) I like math because it 
is fun and I like to figure 
problems. 

129) I liked arithmetic since 
I first new it good. I really 
like to work out it’s puzzles. 

420) I like math because you 
have to put your brain to work 
in order to get it and if you can 
get your brain to work you will 
like it much better than if you 
don’t try to get it, and also 
because I like adding, and 
dividing numbers. 

424) I like Math very, very 
much. In Grammar School it 
was my favorite subject and I 
always got good marks in it. 


224) Arithmetic is a subject 
I don’t like of any kind. I 
like the other subjects much 
better. I have allways failed 
in any test. 

361) I do not like arithmetic 
because I find it difficult to 
understand. 

440) I don’t like math prob- 
lems because usually I find 
them hard to do. 

460) I never have liked 
math. because it was always 
very hard for me. 

621) I know that math is 
essential but, I still don’t care 
for it. Some math problems 
are easy for me to work but I 
still don’t care for math. 







Statements by Boys 


058) Arithmetic was my fav- 
orite subject in grade school. 
I like it very much. 

170) I like math because I 
usealy got a good grade and 
got along with the teacher so 
I like to do the work in math 

193) The reason I like math 
is because I get a kind of thrill 
out of getting the right answer 
in hard problems. Another 
reason I like math is because 
there is a set right and wrong 
answer and no half right and 
half wrong answers. 

257) I like to do problems 
that require mental work and 
work on paper. I like it for 
no special reason except that 
I like it. 

458) I like math and I think 
it’s easy. 


158) I don’t lick arithmetic 
and I never did but I try just 
as hard as I can to get along with 
it. It is hard for me. 

276) I’d like to be an air- 
plane poilt but you haft to 
know your math real good. 
Although I don’t like it I do 
try and learn 

451) I never done any good 
and I don’t think I ever will 
Thats why I don’t like it. 

553) I don’t like arithmetic 
because I can’t consontrate on 
numbers and I don’t like to 
do it. 

813) I don’t like math at all. 


If I could understand or do it 
better I might like it, but as it 
stands: No Sir. 








The factor estimates have means of 20 and standard 
deviations of 3 for each of the two groups. The girls’ scores may 
not be directly comparable with the boys’ scores. The scale 
values, on the other hand, are comparable for both groups. 
Since these few cases were selected at random from those 
expressing unqualified like or dislike, the t test is appropriate for 
checking differences between the ‘like’ and ‘dislike’ groups with 
respect to the factor estimates and scale ‘scores.’ 

For both sexes the ‘like’ groups excel in the number, general, 
and spatial factor scores and in the scale ‘score’; likewise, in each 
instance the ‘dislike’ group has the higher verbal-factor score. 
For the girls, the values of t for the number factor and the scale 
are 5.96 and 5.90, and the remaining t’s are under 1.00. For the 
boys the significant differences are those for the general factor 
and the scale, with t’s of 4.39 and 8.50. The number-factor 
difference approaches significance at the five per cent level with 
t = 1.86; the other t’s are -1.16 and 1.62. 
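
A comparison of this kind can be sketched as follows. The paper does not state which form of the t test was used, so the pooled-variance form for two independent groups is an assumption, and the two sets of five scores are invented for illustration; they are not the values of Table 4.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t for two independent groups.

    statistics.variance uses the n - 1 divisor, so the pooled estimate
    has len(group_a) + len(group_b) - 2 degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


# Hypothetical number-factor estimates for five 'like' and five
# 'dislike' pupils (illustrative values only).
like = [24, 21, 23, 25, 22]
dislike = [18, 20, 17, 19, 16]

t_value = student_t(like, dislike)
degrees_of_freedom = len(like) + len(dislike) - 2
print(t_value, degrees_of_freedom)
```

With five cases in each group the test has 8 degrees of freedom, so a t must be fairly large before it can be called significant.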

Thus even these very small samples are consistent with results 
already set down. The number factor is clearly associated with 
attitude toward numbers in the case of the girls. With the boys 
this relationship is less marked, but their ‘likes’ and ‘dislikes’ 
appear to be related to their general factor to a significant degree. 
This is an interesting sex difference and worthy of more detailed 
study. 

CONCLUSIONS 


Evidence has been set forth that there exists a number factor 
common to tests which contain numbers. It has been suggested 
that this factor is allied with a pupil’s mental set as he approaches 
a task, and that in this case the mental set is determined by his 
general liking for numbers. 

At least two of the hypotheses require further checking in 
order to be deserving of the name. There is a hint that a mental 
set determined by attitude toward numbers may tend to perse- 
verate long enough to affect performance with unrelated material. 
There is a suggestion that girls may be more sensitive than boys 
to their general feeling for numbers. 

The scale was used for the first time in this study. We are 
now in a position to revise and extend it to develop a more valid 
instrument. Personal interviews also might well be employed to 
gain information not revealed by the scale. Since the number 
factor as here measured may be regarded as independent of the 
general intellectual factor, it would be most interesting to delve 
further into the problem of when these individual differences 
appear and how they are developed. 








THE VALIDITY OF UNIVERSITY COUNSELOR 
SELF-RATINGS 


A. J. DRUCKER and H. H. REMMERS 
Purdue University 


Results of studies on self-ratings or self-judgments on various 
ability and personality traits have shown that quite consistently 
a bias is introduced in favor of the typical rater when the criterion 
is ratings by others or some objective measure. Hollingsworth² 
found a tendency for individuals to over-estimate their possession 
of desirable traits and to under-estimate their undesirable ones. 
Jackson³ reported that in general the greater the degree of conceit 
a person has (in the opinion of others) the poorer is his ability to 
judge himself. Hoffman¹ found that individuals tend to show 
consistency in their degree of over-estimation of the desirable and 
under-estimation of the undesirable traits. 

In the Spring of 1948 a questionnaire survey of instructional 
staff personnel was undertaken to obtain an overview of counsel- 
ing facilities, practices and ability at Purdue University. One 
item requested an evaluation of counseling activities as follows: 


How do you rate the quality of your counseling activities compared 
with the University as a whole? Please check in appropriate column. 

 | Excellent, professional quality | Better than average | Average | Less than average | Poor | No Opinion 
On educational problems | | | | | | 
On vocational problems | | | | | | 
On emotional, “personal” problems | | | | | | 























Three other items in the survey were designed to obtain 
information of a direct evaluative nature with respect to counseling 
activities: 

How do you rate the over-all quality of counseling in the University? 
Please check in the appropriate column. 

 | Excellent, professional quality | Better than average | Average | Less than average | Poor | No Opinion 
On educational problems | | | | | | 
On vocational problems | | | | | | 
On emotional, “personal” problems | | | | | | 























Describe briefly any training that you have had in courses related to 
counseling, such as courses in psychology, guidance and counseling, etc. 

Describe briefly your previous experience in, or related to, counseling, 
if any, before September, 1947. 


In addition, those respondents who wished to take a test on 
counseling were given two forms of the test, How I Counsel,* 
designed to measure attributes of counselors. Hence a maximum 
of five counseling measures, here designated A through E for 
convenience, was possible for each subject: A—his self-rating on 
a five-point scale of his own counseling activities; B—his rating 
of University counseling as a whole on a five-point scale; C—his 
combined raw scores on two forms of How I Counsel; D—rating 
of his counseling training by others on a five-point scale; E— 
rating of his counseling experience by others on a five-point scale. 
Results obtained on these items in the original survey warranted 
the further analysis of the data described in the current study. 

Analysis of the self-ratings (A) in this study was made on 
ninety-five respondents on whom training and experience ratings 
(D, E) were also available. Scores on How I Counsel were 
available, and the ratings of counseling training and experience were 
correlated first with the self-ratings (A) and then with the ratings 
of the same individuals of University counseling activities as a 
whole (B). Individual self-ratings were then compared with 
ratings of University counseling as a whole to determine the 
extent and direction of any differences. 

* For detailed discussion of validation and reliability see Benz, Stanley R., 
“An investigation of the attributes and techniques of high-school 
counselors.” Studies in Attitudes, Series XII, Studies in Higher Education, 
LXIV, Division of Educational Reference, Purdue University, October, 1948. 

There is a possibility that a wide variation of meanings was 
placed upon the term ‘educational counseling’ in the original 
survey. Although respondents were instructed in the question- 
naire that educational counseling was to refer to giving aid on 
such problems as course difficulty or choice of curriculum, there 
was a suspicion that many respondents included as incidents of 
educational counseling academic discussions of the course work 
or the special tutoring of slow students. To what extent more 
efficient study or learning methods or choice of curricula in 
terms of an individual’s interests and capacities were discussed 
is not known. 

RESULTS AND DISCUSSION 


Correlation coefficients obtained, corrected for attenuation,* 
are shown in Table 1. If the assumptions underlying these 
corrections are justified, there appears to be a tendency for 
self-ratings on counseling to accompany at least knowledge of 
acceptable counseling techniques (the apparent content of How 
I Counsel test items) insofar as counseling means giving help to 
students on their vocational and personal problems (r’s of .52 
and .56, respectively). Fairly strong relationships were likewise 
found between self-ratings on both vocational and personal 
counseling as correlated with both training (r’s of .58 and .57, 
respectively) and experience (r’s of .54 and .59, respectively) ratings. 
There is a slight tendency, not statistically significant, for those 
making better scores on How I Counsel and those with higher 
training and experience counseling ratings to be more critical of 
University counseling as a whole. It will be noted, however, 
that correlation coefficients involving rating of University 
counseling of any type, while low, are all negative. Conditions 
necessary for computing increased reliability from replication of 
experiments are not satisfied here,⁵ but it can be argued that 
the significance of these r’s as a group is higher than that 
computed for any single r. 

* Reliability coefficients used in correcting correlation coefficients for 
attenuation were computed or assumed as reasonable to be the following: 
“How I Counsel” .85 ± .02 by equivalent forms and Spearman-Brown 
formula, N = 131; 
Training Ratings .90 ± .02 by correlating 138 independent ratings; 
Experience Ratings .90 ± .02 by correlating 134 independent ratings; 
Self-ratings .80 ± .05 assumed for N = 60; 
University ratings .60 ± .08 assumed for N = 60. 
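
The correction itself is not written out in the text. A minimal sketch, assuming the usual Spearman formula (the observed correlation divided by the square root of the product of the two reliabilities), is given below. The reliabilities .85 and .80 are those listed in the footnote for How I Counsel and for the self-ratings; the observed correlation of .43 is a hypothetical value chosen for illustration, and the function name is likewise an assumption.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Spearman's correction: estimated correlation between true scores,
    given the observed correlation and the reliability of each measure."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


r_observed = 0.43            # hypothetical uncorrected correlation
rel_how_i_counsel = 0.85     # from the footnote
rel_self_ratings = 0.80      # from the footnote (an assumed value there)

r_corrected = correct_for_attenuation(
    r_observed, rel_how_i_counsel, rel_self_ratings
)
print(round(r_corrected, 3))
```

Because the divisor is less than one whenever either reliability is imperfect, the corrected coefficient is always at least as large in absolute value as the observed one, which is why the result depends on the assumed reliabilities being reasonable.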


TABLE 1. SELF-RATING AND UNIVERSITY RATING INTERCORRELATIONS 

 | | Self-Ratings | | | University Ratings | | 
 | | Educ. | Voc. | Pers. | Educ. | Voc. | Pers. 
“How I Counsel” | N | 73 | 59 | 52 | 68 | 67 | 56 
 | r | .08 | .52† | .56† | -.38* | -.29 | -.11 
 | σr‡ | ±.13 | ±.12 | ±.13 | ±.15 | ±.16 | ±.18 
Training Ratings | N | 85 | 73 | 64 | 75 | 73 | 62 
 | r | .27* | .58† | .57† | -.37† | -.37* | -.14 
 | σr | ±.12 | ±.10 | ±.11 | ±.14 | ±.15 | ±.18 
Experience Ratings | N | 82 | 71 | 61 | 72 | 70 | 59 
 | r | .42† | .54† | .59† | -.29* | -.23 | -.15 
 | σr | ±.11 | [illegible] | ±.11 | ±.16 | ±.16 | ±.18 

* Significantly different from zero at the 95 per cent confidence level. 

† Significantly different from zero at the 99 per cent confidence level. 

‡ Kelley’s Formula 161 was employed for finding standard errors of these 
coefficients of correlation corrected for attenuation.⁴ 


Mean University ratings of the better counselors, in terms of 
above-median scores on How I Counsel, were found to be about 
half a scale point below those of the poorer counselors (below- 
median scores) for educational and vocational counseling. Dif- 
ferences were significant at about the 95 per cent confidence level. 
Table 2 shows the distributions for good and poor counselors for 
each type of counseling rating. 
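
The critical ratios reported in Table 2 can be approximated from the published means, standard deviations, and group sizes. The sketch below assumes the conventional definition, the difference between two means divided by the standard error of that difference for independent groups; the paper does not state the formula, so small discrepancies from the printed values are to be expected. The function name and the dictionary layout are illustrative.

```python
from math import sqrt


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two independent means divided by the
    standard error of the difference."""
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# (mean, SD, N) for poor and good counselors, as printed in Table 2.
table_2 = {
    "educational": {"poor": (3.62, 1.06, 34), "good": (3.09, 1.01, 34)},
    "vocational": {"poor": (3.38, 1.29, 32), "good": (2.89, 0.98, 35)},
    "personal": {"poor": (2.89, 1.26, 28), "good": (2.82, 1.17, 28)},
}

ratios = {
    kind: round(critical_ratio(*groups["poor"], *groups["good"]), 2)
    for kind, groups in table_2.items()
}
print(ratios)
```

Computed this way the ratios come to about 2.11, 1.74, and .22, close to the printed 2.13, 1.73, and .21; the small differences presumably reflect rounding in the published means and standard deviations.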

Little or no relationship is indicated between self-rating on 
educational counseling and scores on How I Counsel. In addition 
to the wide variation of meanings possibly placed on educational 
counseling by respondents, it can be hypothesized that low ratings 
on this ability would be thought to discredit one’s ability as an 
educator. Moreover, it is quite possible that more deliberately 
frank self-ratings were offered on vocational and personal 
counseling activities, which in many cases represented duties connected 
with the teaching situation for which little time and no extra 
remuneration was provided and for which a good many 
instructors might feel they had no responsibility. Extent of accuracy 
or validity of self-ratings of counseling activities may be related 
to the counselors’ need for maintaining occupational status and 
prestige. 

TABLE 2. DISTRIBUTIONS OF RATINGS OF UNIVERSITY COUNSELING BY GOOD AND POOR COUNSELORS DICHOTOMIZED ON BASIS OF How I Counsel SCORES 

Rating | Educational Counseling: Good | Poor | Vocational Counseling: Good | Poor | Personal Counseling: Good | Poor 
5 | 3 | 6 | 1 | 6 | 1 | 2 
4 | 8 | 16 | 9 | 13 | 9 | 9 
3 | 15 | 7 | 14 | 5 | 7 | 7 
2 | 5 | 3 | 7 | 3 | 6 | 4 
1 | 3 | 2 | 4 | 5 | 5 | 6 
N | 34 | 34 | 35 | 32 | 28 | 28 
Mean | 3.09 | 3.62 | 2.89 | 3.38 | 2.82 | 2.89 
SD | 1.01 | 1.06 | .98 | 1.29 | 1.17 | 1.26 
Critical Ratio | 2.13 | | 1.73 | | .21 | 

TABLE 3. AVERAGE DIFFERENCES BETWEEN COUNSELING SELF-RATINGS AND RATINGS OF UNIVERSITY COUNSELING 

Type of Counseling | N | Difference (in favor of self-rater) 
Educational | 61 | .43 ± .12 
Vocational | 51 | .24 ± .12 
Personal | 46 | .48 ± .15 

In Table 3 are shown the average differences between self- 
ratings on the one hand, and counselors’ own ratings of Uni- 
versity counseling activities, on the other. Counselors tended 
to rate themselves higher than they rated the University counsel- 
ing as a whole on the three types of counseling. The differences 
are reliable ones. 

SUMMARY AND CONCLUSIONS 


1) Self-ratings on the counseling Purdue University staff mem- 
bers do on vocational or personal problems are fairly valid, since 
they correlate above .50 with scores on a test measuring attributes 
of a counselor and with ratings made by others on training and 
experience in counseling. 

2) The better counselors of Purdue University (in terms of 
scores on How I Counsel) on the average are more critical of edu- 
cational and vocational counseling done by the University as a 
whole than are those with lower scores. 

3) Counselors, in rating their own counseling, tend in general 
to place themselves slightly but reliably higher than their own 
estimate of the University counseling as a whole. 

4) Counselors on the average rate University educational coun- 
seling as slightly better than average and University vocational 
and personal counseling as slightly lower than average. 


REFERENCES 


1) G. J. Hoffman, “An experiment in self-estimation.” J. abnorm. 
soc. Psychol., 1923, 18, 43-49. 

2) H. J. Hollingsworth, Vocational Psychology. New York: 
D. Appleton Co., 1916, pp. 143-173. 

3) T. A. Jackson, “Errors in self-judgment.” J. appl. Psychol., 
1929, 13, 372-377. 

4) T. L. Kelley, Statistical Method. New York: The Macmillan 
Company, 1923. 

5) C. C. Peters and W. R. Van Voorhis, Statistical Procedures 
and Their Mathematical Bases. New York: McGraw-Hill Book 
Company, 1940, pp. 473. 





A SCIENTIFIC APPRAISAL OF PROFESSIONAL 
EDUCATION FOR BUSINESS 


DONALD K. BECKLEY 


Simmons College 


One of the basic problems in many areas of professional educa- 
tion is obtaining an objective appraisal of the effectiveness of the 
formal training provided in relation to the educational advan- 
tages of regular employment situations. An area of professional 
education in which such an appraisal is especially needed is train- 
ing at the college and graduate level for executive positions in 
retailing. This type of education is today offered by many col- 
leges and universities, and much reliance is placed on these 
professionally trained people as potential retail executives. 

Widespread as this training has become, however, little has 
thus far been done in attempting to measure its effectiveness in 
any scientific manner. There is to date no valid evidence to 
indicate to what extent or in what areas those who have completed 
a program of formal training are better prepared for useful busi- 
ness careers than those who receive all of their training in actual 
job situations. In an educational area such as this in which 
comparisons are regularly being made between those with training 
and those who have instead entered full-time employment, it is 
particularly important that some clear distinction be made as to 
the areas in which formal training can be most effective. 

This article has been prepared to describe a study carried out 
during the past two years to ascertain the extent to which gradu- 
ates of retailing programs are achieving the objectives now 
regarded as desirable. To do this, the performance of retailing 
graduates was measured in relation to the objectives of their 
program, and compared with the performance of other groups 
who had not received formal training in retailing, and with those 
who had received formal training as well as regular work experi- 
ence. This study was designed to provide a measure of the 
effects of formal training as distinct from work experience, and 
can serve as a guide to future curriculum changes in this field of 
professional education. In brief, it was the purpose of the study 
to apply scientific methods of curriculum analysis to college 
training for executive positions in retailing in order to ascertain 
the current effectiveness of training in this area, and to isolate the 
kinds of behaviors needed in which formal training has proved 
most effective. 

The method of study used here is not without precedent in 
recent measurement studies. The use of evaluation techniques to 
measure the relative progress of a group of students given certain 
kinds of educational experiences as compared with a control 
group was carried out in the Eight-Year Study of the Progressive 
Education Association. A somewhat different use of evaluation 
in measuring changes brought about through education was a 
study made by questionnaire to ascertain the differences between 
graduates and non-graduates of the University of Minnesota. 
Another study¹ more directly comparable to that undertaken 
here was made of the extent to which students of foods and nutri- 
tion courses demonstrated greater competence than students 
without this training in respect to certain objectives such as appli- 
cation of principles and development of interests. 

The study described here is concerned with training of students 
at the upper college and graduate school levels for executive 
positions in department stores. The cases used for investigation were 
entering and graduating students in the Simmons College Prince 
School of Retailing and workers in executive and junior executive 
positions in Boston department and specialty stores. The pro- 
cedures followed in the study were as follows: 

A) The determination of the objectives to be measured. 

B) The construction of a comprehensive examination in retail- 
ing to measure competence in the objectives found above. 

C) The administration of the examination to these four groups: 
(1) Incoming students at the Simmons College Prince School of 
Retailing who had not studied retailing in formal courses nor had 
extensive work experience. (2) Students who had completed the 
course in retailing at the School, but had not yet had extensive 
work experience. (3) Executives and junior executives in Bos- 
ton stores who were in positions of the kind graduates soon would 
be taking, but who had had no formal retail training. (4) Store 
executives and junior executives who had had a specified amount 
of store experience and also were graduates of the School. 

D) Interpretation of results. 










DETERMINING OBJECTIVES FOR CONSIDERATION 


Before any estimate can be made of the extent to which 
students of retailing are aided by formal training in becoming more 
competent store executives, it is necessary to determine the 
objectives to be sought. Research in the field of executive train- 
ing for retailing has not yet advanced to the level at which the 
literature includes carefully considered suggestions on curriculum 
development at the college level. Hence, it is necessary to seek 
authoritative sources of opinion on the kinds of changes that 
should be brought about in college-level students who are to be 
trained for executive positions. In actual practice, objectives for 
retailing are based most commonly on job analysis, which pre- 
sents the tasks to be learned primarily in terms of how they are 
currently being carried on. The broader view regarded here as 
far more desirable seeks to consider training for retailing not 
merely in terms of perpetuating the status quo, but through care- 
ful consideration of the needs of the individual consumer, the 
national economy, and the needs of the individual student. 
Retail training touches each of these aspects of our society, and a 
comprehensive consideration of the objectives which should be 
covered must recognize its broad scope. 

The objectives regarded as desirable were selected as a result 
of a detailed analysis of the various viewpoints which together 
comprise the broad area covered by retailing and make possible a 
unified view of the task executive training should seek to accom- 
plish. These viewpoints are: (1) the social economist who envi- 
sions retailing as a useful factor in our national economy, (2) 
the consumer as expressed through the agencies set up to protect 
his interest, (3) the retail executive who sees the requirements of 
the field, (4) the critic of our retailing system both inside and 
outside the store, (5) schools of retailing set up to provide the 
training with which this study is concerned, and (6) the prospec- 
tive student of retailing who states his reasons for wishing formal 
training in this field. 

Through this analysis, thirty-six objectives were found to repre- 
sent the varied points of view indicated above. These were then 
appraised in terms of their philosophic and psychological con- 
siderations in order to formulate some concepts of instruction in 
retailing which might be expressed as a theory. One of the purposes
of the study would then be to test the soundness of this
theory.

What, then, are the sorts of objectives which college programs 
can teach more effectively than work itself can provide? It is 
proposed that the first area in which college programs can operate 
more effectively is in the teaching of principles which apply to 
retail operation and to outside conditions which affect retailing. 
Formal training can, through the impartial nature of its inquiry,
examine these principles from various viewpoints, and sub- 
stantiate their existence through observations of cases in a variety 
of situations. This differs from the position of the store employee, 
who more commonly observes practices rather than compre- 
hends principles, and is able to see the relative desirability of 
procedures only in respect to their success or failure within one 
organization and in a given situation. 

The second area in which college programs can be more effec- 
tive is that of attitudes and of sensitivity to the social possibilities 
of the critical position retailing assumes economically as a large 
part of the link between production and consumption. College 
training can be most effective in teaching this kind of behavior 
because an educational institution, unlike a business organiza- 
tion, is conducted on a non-profit basis, and as such should be 
able to visualize various objectives for a retail store in addition to 
profitable operation. It is suggested, therefore, that the entire 
nature of the aims and surroundings of an educational institution 
can properly lend themselves to an unbiased appraisal of the social 
function of business. 

The third area in which college programs should be more effec- 
tive is in teaching all desired kinds of background knowledge that 
go beyond the scope of any one business institution. 

On the other hand, work experience itself would seem able to 
serve more effectively than the college in teaching specific job 
techniques. An example at an elementary level is the making out 
of saleschecks, and at a higher level of difficulty it would include 
the development of skill in writing effective advertising copy or 
selecting merchandise that will appeal to a particular class of 
customer. Tasks of this nature have principles underlying them 
which can be taught in school, yet the practical application of 
these principles requires experience in actual situations, and only 
work itself can adequately provide this experience.

In summary, then, the theory was proposed that formal train- 
ing can be effective in teaching those objectives that: (1) derive 
from the concept that retailing is one phase of our distribution 
system and has broad implications in our social and economic life, 
and (2) concern general principles and knowledge that go beyond 
the scope of any one business institution. Retail work experi- 
ence, on the other hand, is best suited to those objectives which 
concern specific techniques and methods of performance in which 
it is adequate, or perhaps preferable, to know well one system 
rather than to have a broader but less detailed grasp of the overall 
situation. The extent to which this theory was found to be 
supported will be described in later paragraphs. 

The objectives determined through the screening process to be 
most desirable in terms of behavior that are currently being 
taught at the Simmons College Prince School of Retailing are as 
follows: 

1) The development of effective methods of thinking in respect 
to: (a) application of principles of intelligent business manage- 
ment in retail store operation. (b) interpretation of data relat- 
ing to consumer needs and distribution costs. 

2) The acquisition of important information in respect to: (a) 
identification of relationships between items of information or 
facts relating to the retailer and to the consumer. (b) compre- 
hension of the nature of distribution and of the nature and effect 
of governmental intervention in retailing. 

3) The cultivation of skill in the use of mathematics in retailing
situations.


THE CONSTRUCTION OF AN EXAMINATION


The objectives listed above were analyzed in terms of content 
and the situations in which learning could be most effective, and 
an examination in retailing was constructed for use in determining 
the relative effectiveness of each of the four groups being com- 
pared in respect to each of these objectives. The three types of 
behavior included were tested in the following manner: 

1) Development of effective methods of thinking. Sets of 
data representing the selected content areas were presented, fol- 
lowed by a number of statements which purport to be inter- 
pretations. The student was asked to indicate her judgment of 
each of the statements by classifying it as to its relationship to the
data.

2) Acquisition of important information. Exercises of the 
matching, true-false, and multiple-choice varieties were used to 
measure identification of relationships between items of informa- 
tion and comprehension of meanings. 

3) Cultivation of skill in the use of mathematics in retailing 
situations. Exercises were set up for determining one mathe- 
matical component when the other related figures were known, 
and for the identification of the correct statement of retailing 
equations. 

The validity of this examination was checked through a process 
of critical analysis by members of the faculty of the School in 
charge of the courses in which the desired objectives were taught. 
The reliability of the examination was measured through use of 
the Kuder-Richardson formula number 20,² which measures
what has been described as the coefficient of equivalence. 
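
As an illustration of the reliability computation named above, the Kuder-Richardson formula 20 can be sketched in Python. The function name, the scoring of each item as 0 or 1, and the sample responses are assumptions made for this sketch; none of them are taken from the study.

```python
def kr20(item_scores):
    """Kuder-Richardson formula 20 for a test of right/wrong items.

    item_scores: one row per examinee, each row a list of 0/1 item scores.
    (Hypothetical data layout, chosen for this sketch.)
    """
    n_people = len(item_scores)
    n_items = len(item_scores[0])

    # Total score for each examinee and the variance of those totals
    # (divided by N, as in the classical statement of the formula).
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum over items of p*q, where p is the proportion passing the item.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Made-up responses of five examinees to four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3))  # prints 0.8
```

A coefficient near 1 would indicate that the items rank the examinees consistently; the value printed here only describes the made-up responses.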

The examination was administered between May 13 and Sep- 
tember 15, 1947, to subjects in each of the four groups on two 
levels: (1) those with four years of liberal arts college background, 
and (2) those with two years of liberal arts college background, 
to correspond to the two retailing programs at the School. In 
addition, an intelligence test (Wonderlic Personnel Test, Form 
F) was given so that differences between groups in respect to the 
intelligence level of the subjects could be controlled. 


THE DESIGN OF THE EXPERIMENT 


The problem of this study for which a statistical solution was 
sought was to discover whether or not differences in gains among 
groups of students in respect to the objectives regarded as desir- 
able were greater than would be expected from the operation of 
chance factors alone. The purpose of the experiment, then, was 
to determine the strength of these three factors: (1) formal retail 
training, (2) retail work experience, and (3) undergraduate col- 
lege education, in respect to achievement of selected retailing 
objectives. The hypothesis applied here was that groups com- 
parable in all respects but differing in their treatment should 
reflect differences in achievement that were the result of that 
particular treatment. The differences among groups were 
observed in respect to each of the objectives, divided into subtests
as follows: (I) application of principles of retail management,
(II) interpretation of data relating to consumer needs,
(III) identification of retailing facts, (IV) comprehension of the 
nature of distribution, and (V) cultivation of skill in the use of 
retailing mathematics. 

The nature of the experiment carried out can best be indicated
by arranging the data in the following design:

                      No Training                 Training
                2 Yrs. Col.  4 Yrs. Col.   2 Yrs. Col.  4 Yrs. Col.
No experience     N = 36       N = 30        N = 29       N = 28
Experience        N = 29       N = 32        N = 12       N = 10

This plan makes possible the separate identification of the 
three factors to be studied as follows: group 1, no training, no 
experience; group 2, training, no experience; group 3, experience, 
no training; and group 4, both training and experience. The 
group numbers refer to the categories listed earlier under adminis- 
tration of the examination. In working out the design, the fol- 
lowing comparisons were made: (1) ‘no experience’ group with 
‘experience’ group, (2) ‘no training’ group with ‘training’ group, 
and (3) ‘two years of college’ group with ‘four years of college’ 
group. 

A number of possible statistical techniques were considered in 
planning this study. One treatment often used in investigations 
such as this is the matching of pairs. Although it would be 
possible to match pairs of cases within each pair of groups in the 
present experiment, the disproportionate number of cases would 
mean that many cases in the larger groups would be wasted 
after pairings were made. Another procedure that is more 
appropriate in this situation involves the matching of experi- 
mental and control groups through use of a regression technique 
that does not require pair-by-pair matching. This treatment, as 
described by Peters,³ can make it possible to know whether the
three experimental groups do better on the achievement examina- 
tion in retailing than would be expected in view of their intelli- 
gence test scores. The hypothesis to be tested here is that there 
are no real differences produced by the factors introduced, and
that any differences in final mean scores, after allowances have 
been made for chance differences in initial mean scores, are due 
entirely to chance fluctuations in random sampling. 

While similar in many respects to Fisher’s covariance technique,⁴
the Peters technique derives the regression equation from
the statistics of the control group rather than from the experi- 
mental and control groups pooled, on the ground that a pooled 
estimate would be a meaningless hybrid if the two groups dif- 
fered by reason of the experimental factor, as probably would be 
the case. In this experiment, a regression equation was calcu- 
lated from the scores obtained on the intelligence test and on the 
retailing examination by the control group—which had neither 
retail training nor work experience. The equation was then used 
to predict scores for the three other groups, which were regarded as the
experimental groups. This predicted score was then compared 
with the actual score of each case in the experimental groups, and 
the significance of the difference between means of predicted and 
actual scores tested for each objective in each subgroup separately. 
In this type of treatment, if the mean of the obtained scores is 
significantly greater than that of the ‘expected’ scores, the experi- 
mental factor is indicated as having a differential potency in 
contributing to growth.® 
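
The procedure just described can be sketched as follows: fit a least-squares line on the control group only, use it to predict each experimental subject's examination score from her intelligence score, and average the actual-minus-predicted differences. The function names and the small data set are hypothetical, and the sketch is a reading of the description above rather than a reproduction of Peters' worked procedure.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_gain_over_prediction(control_iq, control_exam, exp_iq, exp_exam):
    """Mean of (actual - predicted) exam scores in an experimental group.

    The regression line comes from the control group alone, not from
    the pooled groups.
    """
    slope, intercept = fit_line(control_iq, control_exam)
    predicted = [intercept + slope * x for x in exp_iq]
    diffs = [actual - pred for actual, pred in zip(exp_exam, predicted)]
    return mean(diffs), diffs


# Made-up scores: control group (no training, no experience) ...
control_iq = [20, 25, 30, 35]
control_exam = [40, 50, 60, 70]
# ... and one experimental group.
exp_iq = [22, 28]
exp_exam = [50, 60]

gain, diffs = mean_gain_over_prediction(control_iq, control_exam, exp_iq, exp_exam)
print(gain)  # mean excess of actual over predicted scores
```

A positive mean difference says only that the experimental group scored above what its intelligence scores would predict; whether the excess is significant depends on its standard error.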

The use of the regression technique was especially appropriate 
here, since it was recognized that there was a positive correlation 
between academic aptitude or intelligence, particularly verbal 
ability, and scores on the retailing examination. This technique 
served to eliminate the influence of differences in intelligence, and 
provided a comparison between predicted and actual scores. The 
differences between means attained in the various tests were 
divided by the standard errors of the differences in order to 
determine the t-ratios. 

Of interest also in analyzing the data is the question of the 
magnitude of the relationship between achievement in retailing 
and the several factors to be isolated: retail training, work experi- 
ence, and college education. Some measure of correlation was 
needed here to indicate the strength of relationship between 
achievement and each one of these factors with the other factors 
held constant. 

A statistical treatment highly suitable for this purpose is the 
Kelley correlation ratio ε.⁷ This is particularly appropriate in
the present situation, since it is not affected by disproportionate
numbers of cases in the various groups. As described by Peters 
and Van Voorhis,⁸ when corrected, ε has a standard meaning free
from bias and independent of the size of the sample and of the 
number of classes into which the sample is divided. It is shown 
to have all the merits of analysis of variance, and in addition is 
interpreted positively rather than negatively as in the case of the 
t- and F-scores involving the null hypothesis. In this study, 
ε was used to estimate the correlation between achievement on
the retailing examination and each factor being measured. 
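
A minimal sketch of the uncorrected ratio η² and of Kelley's ε² for scores sorted into groups is given here. It uses the usual definition in which ε² replaces the sums of squares in η² by unbiased variance estimates; the function name and the two made-up groups are assumptions for illustration only.

```python
def eta_and_epsilon_squared(groups):
    """Return (eta squared, epsilon squared) for a list of score lists."""
    all_scores = [s for g in groups for s in g]
    n, k = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n

    # Total and within-group sums of squares.
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    ss_within = 0.0
    for g in groups:
        g_mean = sum(g) / len(g)
        ss_within += sum((s - g_mean) ** 2 for s in g)

    # eta squared uses the sums of squares as they stand.
    eta_sq = 1 - ss_within / ss_total
    # epsilon squared uses variance estimates with degrees of freedom,
    # which removes the upward bias of eta squared.
    eps_sq = 1 - (ss_within / (n - k)) / (ss_total / (n - 1))
    return eta_sq, eps_sq


# Two groups of unequal size, e.g. 'no training' and 'training'.
eta_sq, eps_sq = eta_and_epsilon_squared([[1, 2, 3], [4, 5, 6, 7]])
print(round(eta_sq, 3), round(eps_sq, 3))  # prints 0.75 0.7
```

Because the correction depends on the degrees of freedom, ε² is always somewhat smaller than η² for the same data.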

The use of the two techniques described above was planned to 
eliminate differences in intelligence as a possible uncontrolled 
variable, and left age and sex as the only other predictable 
external factors to be controlled. Recognizing that the age 
levels of the two subgroups (subjects with four years of college 
and those with two years of college) differ by several years by 
definition, critical ratios for the differences in ages of the various 
groups were calculated, and showed that age was minimized as a 
variable factor here. The factor of sex difference was eliminated 
because all those being tested were women. 


AN ANALYSIS OF THE DATA 


Although the detailed statistical procedures followed will not 
be described here, a summary will be given of the steps followed 
in working out each of the statistical treatments. 

In using the Peters’ regression technique, the work of analyzing 
the data was that of testing for the significance of differences 
between the control group (Group 1), and the experimental 
groups (Groups 2, 3, and 4), when the groups had been equated 
for intelligence. The first step in this procedure was to calculate 
for the control group the product-moment correlation between 
intelligence as indicated by their scores on the Wonderlic Per- 
sonnel Test and their achievement in each of the five objectives as 
represented by the five part-scores. The significance of these 
correlations was then tested.
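
The first step can be illustrated with a short sketch that computes a product-moment correlation and the customary t-ratio for testing whether it differs from zero. The article does not say which significance test was applied, so the t-ratio shown is an assumption, as are the function names and the data.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t-ratio for the hypothesis of zero correlation, with n - 2 d.f."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Made-up intelligence scores and part-scores for five control subjects.
iq = [1, 2, 3, 4, 5]
part_score = [2, 1, 4, 3, 5]

r = pearson_r(iq, part_score)
t = t_for_r(r, len(iq))
print(round(r, 3), round(t, 3))  # prints 0.8 2.309
```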

In predicting the score each member of the experimental groups 
would be expected to make on the basis of her learning ability, a
regression equation⁹ was used in score form. Through use of
a regression equation, predicted scores were obtained from 








Appraisal of Professional Education for Business 183 


which differences between actual and predicted scores could be 
calculated. 
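
For reference, the usual score form of a two-variable regression equation is shown below. The symbols are assumptions introduced here: X is a subject's intelligence test score, Y' her predicted part-score, M_x and M_y the control-group means, σ_x and σ_y the control-group standard deviations, and r_xy the control-group correlation. The article itself does not print the equation.

```latex
Y' = r_{xy}\,\frac{\sigma_y}{\sigma_x}\,(X - M_x) + M_y
```

The difference for each experimental subject is then her actual score Y minus Y'.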

In calculating the significance of the differences between pre- 
dicted and actual scores, it was necessary also to compute the 
standard error of the mean of the differences. Because in this 
situation the mean of the matching scores of the control group 
differs from the mean of the experimental groups, a formula which 
provides an adjustment for the difference in means between the 
groups must be used.¹⁰
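
The formula cited is not reproduced in the article. As a hedged stand-in, the sketch below uses the textbook least-squares result for the standard error of the difference between an experimental group's mean actual score and its mean predicted score. It assumes that the residual variance about the regression line is the same in both groups, and it may differ in detail from the formula Peters and Van Voorhis give. All names and data are hypothetical.

```python
import math
from statistics import mean


def t_ratio_for_mean_difference(control_x, control_y, exp_x, exp_y):
    """Return (mean difference, standard error, t-ratio)."""
    n_c, n_e = len(control_x), len(exp_x)
    mx_c, my_c = mean(control_x), mean(control_y)

    # Regression line fitted on the control group only.
    sxx = sum((x - mx_c) ** 2 for x in control_x)
    sxy = sum((x - mx_c) * (y - my_c) for x, y in zip(control_x, control_y))
    slope = sxy / sxx
    intercept = my_c - slope * mx_c

    # Residual variance about the line, with n_c - 2 degrees of freedom.
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(control_x, control_y))
    s2 = ss_res / (n_c - 2)

    # Mean actual score minus mean predicted score.
    mx_e = mean(exp_x)
    mean_diff = mean(exp_y) - (intercept + slope * mx_e)

    # The last term is the adjustment for the difference between the
    # group means on the matching variable.
    se = math.sqrt(s2 * (1 / n_e + 1 / n_c + (mx_e - mx_c) ** 2 / sxx))
    return mean_diff, se, mean_diff / se


mean_diff, se, t = t_ratio_for_mean_difference(
    [1, 2, 3, 4], [2, 3, 5, 6],   # control: matching score, exam score
    [2, 3], [5, 7],               # experimental group
)
print(round(mean_diff, 3), round(se, 3), round(t, 2))
```

When the two groups have the same mean on the matching variable, the adjustment term vanishes and the standard error depends only on the residual variance and the group sizes.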

In summary, these conclusions were drawn from this analysis: 

1) Subjects with retail training and no work experience per- 
formed significantly better on all objectives, thus indicating the 
effectiveness of the School in training for these selected objectives. 

2) Subjects with work experience and no training performed 
significantly better than the control group in respect to applica- 
tion of principles to retail management, identification of retailing 
facts, and cultivation of skills in retailing mathematics. On the 
other hand, work experience did not significantly improve 
the level of achievement of these subjects in interpreting data 
relating to consumer needs and comprehension of the nature of 
distribution. 

3) Subjects with both training and work experience generally 
performed significantly better than the control group, although 
there were two exceptions. Subjects with two years of college 
did not show significant differences in respect to interpretation of 
data relating to consumer needs and cultivation of skill in the use 
of retailing mathematics. 

A detailed comparison of these results with the predictions pro- 
posed earlier as a theory of retail instruction can best be made 
when the analysis of data has been completed. 

The steps described above using the Peters’ regression tech- 
nique indicate clearly the level of significance of the mean differ- 
ences in achievement scores when the groups are equated for 
intelligence, but they do not identify the relative strength of the 
factors being measured. This can be done through use of the 
Kelley Epsilon coefficient, which can provide a comparison of 
the variance within groups and the total variance. Unlike the 
correlation ratio, η, for which the variances are actual ones calculated
from the data being used, the variances for ε are population
estimates based on the sample, and provide the unbiased feature
of ε.¹¹

In estimating the strength of relationship between achievement 
scores in the retailing examination and the three factors to be 
measured—training, work experience, and college education—it 
is possible to set up direct comparisons of various pairs of subject 
groups as a means of isolating each of the factors. For example, 
to find the strength of work experience, group 1 (no work, no 
training) was compared with group 3 (work, no training); and 
group 2 (no work, training) was compared with group 4 (work, 
training). Similar sets of comparisons were made for the other 
factors measured, thus holding constant the factor present in or 
absent from both groups in each set being compared. 
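
The logic of these paired comparisons can be sketched by listing, for any one factor, the pairs of groups that differ on that factor and agree on the other. The dictionary below simply encodes the four groups as defined in the design; the function name is hypothetical, and the college-level factor is left out for brevity.

```python
from itertools import combinations

# The four groups of the design, coded by the two factors that define them.
GROUPS = {
    1: {"training": False, "work": False},
    2: {"training": True, "work": False},
    3: {"training": False, "work": True},
    4: {"training": True, "work": True},
}


def pairs_isolating(factor):
    """Pairs of groups that differ on `factor` and match on every other."""
    pairs = []
    for a, b in combinations(sorted(GROUPS), 2):
        ga, gb = GROUPS[a], GROUPS[b]
        others_equal = all(ga[f] == gb[f] for f in ga if f != factor)
        if ga[factor] != gb[factor] and others_equal:
            pairs.append((a, b))
    return pairs


print(pairs_isolating("work"))      # prints [(1, 3), (2, 4)]
print(pairs_isolating("training"))  # prints [(1, 2), (3, 4)]
```

The pairs returned for work experience are the two comparisons named in the text.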

It is possible also to arrange the data so that two of the factors 
to be measured can be isolated simultaneously while the strength 
of the third factor is being measured. In this treatment, subjects 
can be sorted into classes on the basis of some known factor, and 
then subsorted into subclasses. The variance of these subclasses, 
then, will be due to factors other than those which determine the 
class sorting. This use of partial ε was regarded as more meaningful
in this experiment, and will be described through use of one
example. 

In order to isolate the factor of work experience and measure
its strength in relation to the other variables, it was necessary to
sort and then to subsort the data into the following form:

          Two Years College                     Four Years College
    Training         No Training          Training         No Training
  Work   No Work    Work   No Work      Work   No Work    Work   No Work

This arrangement of data makes it possible to calculate a partial 
ε for work experience on achievement, with training and education
held constant.¹² The level of significance of the ε can be
found by inspection of the tables for ε².
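
One way to read the sorting and subsorting described above is sketched here: the unbiased variance estimate within the subclasses (work and no work) is compared with the unbiased variance estimate within the classes (each combination of training and college level). The formula used, one minus the ratio of those two estimates, is an assumption made for this sketch and is not quoted from Peters and Van Voorhis; the names and scores are hypothetical.

```python
def within_ss(cells):
    """Sum of squared deviations of scores from their own cell mean."""
    total = 0.0
    for cell in cells:
        m = sum(cell) / len(cell)
        total += sum((s - m) ** 2 for s in cell)
    return total


def partial_epsilon_squared(classes):
    """classes: list of classes, each a list of subclasses (score lists).

    Assumed form: 1 - (variance within subclasses) / (variance within classes),
    both taken as unbiased estimates.
    """
    subclasses = [sub for cls in classes for sub in cls]
    pooled_classes = [[s for sub in cls for s in sub] for cls in classes]
    n = sum(len(sub) for sub in subclasses)

    var_within_classes = within_ss(pooled_classes) / (n - len(classes))
    var_within_subclasses = within_ss(subclasses) / (n - len(subclasses))
    return 1 - var_within_subclasses / var_within_classes


# Two classes (say, training and no training at one college level),
# each subsorted into work and no-work subclasses. Scores are made up.
classes = [
    [[1, 2, 3], [5, 6, 7]],
    [[2, 3, 4], [4, 5, 6]],
]
eps_sq_partial = partial_epsilon_squared(classes)
print(round(eps_sq_partial, 3))  # prints 0.737
```

The larger the share of within-class variance that disappears on subsorting, the stronger the subsorting factor appears with the class factors held constant.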

As indicated earlier, ε provides a less biased estimate than
η, since it corrects for small numbers in the various classes caused
by dividing data into too many categories. To make this correc- 
tion, an estimate is needed of the amount by which the correlation 
being made would be improved if more categories had been used. 
It is necessary, then, to assume a rectilinear regression within each 
broad class, and to assume that there is a normal distribution. 
The linear correlation between index values and variates in a normal
distribution can then be substituted in a formula¹³ to find the
corrected ε. This statistic can then be used to estimate the
strength of the relationship inherent in the normally distributed 
population from which these relatively coarsely grouped data come.
This estimate is based on the assumption that increasing the 
number of categories would actually increase ε, and that there is
a positive linear correlation between the number of categories
up to about fifteen and the size of the ε obtained. In effect, this
ε can be used to indicate an estimate of what ε would be if the
data had been presented in terms of a large number of groups 
which had the factors of training, work experience, and college 
education, each in differing amounts. 

In comparing the results of this ε treatment with the Peters’
technique results, it is of interest to note that a substantially
larger number of the Peters’ comparisons were significant than ε
calculations. In the latter, no attempt was made to hold con- 
stant the effect of intelligence, and some of the groups of scores 
which in raw form are without significant differences are shown to 
have significance when the effect of intelligence is held constant. 
These conclusions were drawn from the ε treatment:

1) Subjects with training performed better on all objectives 
than those with work experience, with the exception of the part- 
score involving skill in the use of retailing mathematics, in which 
the strengths of training and of work experience were approxi- 
mately equal. 

2) In respect to the factor of work experience, the strengths of 
part-scores III and V (identification of retailing facts and skill in 
the use of retailing mathematics) were strongest, thus suggesting 
that work experience itself is more effective in these areas than in 
the others measured. 

3) In respect to the factor of college education, four-year col- 
lege people were better prepared to be effective retail executives 
than those subjects with two years of college except in the case of
part-score I (application of principles of retail management) 
where no significant difference was shown to exist. 


CONCLUSIONS 


Earlier in this article a theory of retail instruction was ad- 
vanced, and it is now possible to test the soundness of this theory 
by comparing with it the results of the analysis of data. This 
must be qualified, however, by the fact that not all of the objec- 
tives suggested as desirable were currently being taught at the 
Simmons College Prince School of Retailing. In terms of the 
objectives measured, the following general conclusions were 
drawn: 

1) The proposal in the theory of retail training that formal training
alone can be more effective than work experience alone in teach- 
ing those objectives that: (1) derive from the concept that retail- 
ing is one phase of our distribution system with broad implications 
in our social and economic life, and (2) concern general principles 
and knowledge that go beyond the scope of any one business 
institution, was substantiated in respect to these objectives: (1) 
application of principles of retail management, (2) interpretation 
of data relating to consumer needs, (3) identification of retailing 
facts, and (4) comprehension of the nature of distribution. 

2) The proposal in the theory of retail training that work experience
alone can be more effective than formal training alone in teaching 
specific job techniques was not substantiated in respect to the 
objective: cultivation of skill in the use of retailing mathematics. 
Formal training alone was found to be approximately equal in 
effectiveness to work experience alone in this area. 

3) The presumption that the combination of formal training 
and work experience together would prove more effective than 
either training or work experience alone was not consistently 
borne out, possibly because of limitations in the size of the sample 
studied. Although subjects in this group performed significantly 
better than the control group in the case of all but two subgroups, 
these subjects did not consistently show significantly greater 
differences as compared with subjects with training or work 
experience alone. Many of the subjects tested had been working 
since graduation in personnel positions which did not directly 
involve customer contact or use of merchandising mathematics, 
and the data suggest that, as with training in other fields, people
remember best those kinds of learning in which they are most
directly interested or employed.

4) Of the five objectives measured, work experience was shown 
to be relatively the most effective in skill in the use of retailing
mathematics and identification of retailing facts. Work
experience was least effective in teaching the comprehension of 
the nature of distribution. As indicated above, work experience 
equalled formal training in effectiveness only in respect to skill in 
the use of retailing mathematics. 

5) Subjects with four years of liberal arts college education 
were better prepared to be effective retail executives than those 
subjects with two years of liberal arts college work, except in 
the case of the objective: application of principles of retail 
management, where no significant relationship exists. 

6) The results of this investigation emphasize the sound 
position of the school of retailing in providing formal training 
in the areas now being taught. Although it is clear that skill 
in the use of retailing mathematics can be learned equally effec- 
tively on the job, formal training in this objective can help new 
workers to avoid the incidental fumbling caused by lack of 
mathematical training before employment. The fact that the 
school does not now completely cover in its training the range 
of objectives considered from several viewpoints to be desirable, 
suggests the desirability of expanding the range of objectives 
taught, particularly in respect to the implications of retailing as 
a part of our social and economic life. 


REFERENCES 


1) Jean Cozine, An Evaluation of the Foods and Nutrition Work as Offered 
by the State-Supported Colleges and Universities of Missouri. Unpublished 
Ph.D. dissertation, Department of Education, University of Chicago, 1946. 

2) G. F. Kuder and M. W. Richardson, “The Theory of the Estimation of
Test Reliability,” Psychometrika, II (1937), 158. See also L. J. Cronbach,
“Test ‘Reliability’: Its Meaning and Determination,” Psychometrika, XII
(1947), 1-16.

3) C. C. Peters, “A Method of Matching Groups for Experiment with no
Loss of Population,” Journal of Educational Research, XXXIV (1940), 70-74.

4) E. F. Lindquist, “Statistical Analysis in Educational Research,”
Chapter VI, Boston: Houghton, Mifflin Company, 1940.

5) C. C. Peters and Others, “Research Methods and Designs,” Review of
Educational Research, XV (1945), 377-93.

6) C. C. Peters and W. R. Van Voorhis, Statistical Procedures and Their
Mathematical Bases, p. 463. New York: McGraw-Hill Book Company,
1940.

7) T. L. Kelley, “An Unbiased Correlation Measure,” Proceedings of the
National Academy of Sciences, XXI (1935), 554-59.

8) Peters and Van Voorhis, op. cit., p. 323.

9) Ibid., pp. 111 and 463.

10) Ibid., p. 465.

11) Ibid., pp. 312-327. 

12) Ibid., p. 326. 

13) Ibid., pp. 323, 398. 











BOOK REVIEWS 


T. G. Andrews, Editor. Methods of Psychology. New York:
John Wiley and Sons, Inc., 1948, pp. 716. $5.00.


Experimental psychology, as commonly thought of, includes 
that aspect of psychology which is concerned with laboratory 
experimentation especially in such subjects as sensation, per- 
ception, learning, and the like. This delimitation of the field 
has a certain historical raison d’être, inasmuch as it represented
the first clear-cut separation of psychology from the broad field 
of philosophy. However, since the early 1930’s there has been 
growing recognition that experimental psychology is a method 
of approach rather than a content field. Increasingly the experi- 
mental attitude has been introduced into the solution of practical 
problems in educational, clinical, industrial, and social psy- 
chology. Yet, an analysis of textbooks or laboratory and class- 
room course outlines entitled ‘experimental psychology’ still 
reveals the subject rather than the method approach. For the 
first time in the book being reviewed, there is available to 
advanced students in psychology, and in those related fields in 
which investigation is frequently psychological in nature, an 
exposition of methods of investigation as distinct from the prob- 
lems being investigated, or the results of such investigations. 

In the words of the editor: “this book has been written in
recognition of the need for greater emphasis on methodology in
the training of students in psychology.” Dr. Andrews and
twenty-one contributors discuss and describe the major tech- 
niques of psychological investigation. In the initial chapter the 
editor analyses the theoretical problems involved in the design 
of psychological experiments. In succeeding chapters his col- 
laborators describe experimental design, techniques, and appa- 
ratus used in such areas as learning, psychophysics, sensation 
and perception, animal behavior, feeling and emotion, neuro- 
psychology, motor functions, aptitude, intelligence and person- 
ality appraisal, clinical psychology, child behavior, and social 
behavior. As is usual in collected volumes of this sort, the 
separate contributions vary in quality and in the method of 
approach. However, all of the authors have satisfactorily 
achieved the goal of emphasising methodology, and of subordinating
content of investigation to the rôle of illustration. Each
chapter provides a substantial bibliography of the subject being
discussed.
C. M. Louttit
University of Illinois, Galesburg


William S. Gray. On Their Own in Reading. Chicago: Scott,
Foresman and Co., 1948, pp. 268. 


One of the most important problems in teaching reading in the 
elementary school is achieving independence in word perception. 
All too frequently teachers fail to develop in their pupils adequate 
techniques for dealing with new words in context. This little 
book describes in detail a program for the development of word- 
attack skills in the elementary grades. It is both a text and a 
guide, giving practical help to teachers for their day-to-day needs. 

After discussing changing viewpoints on teaching word perception,
consideration is given to the rôle of word perception in
reading and how printed words are identified. This is followed 
by analyses of the five major aids to word perception: context 
clues, word-form clues, structural analysis, phonetic analysis, 
and dictionary use. Part Two of the text presents a sequential 
program for teaching these five aids to word perception. This 
is done in a complete and practical manner. First reading of 
the book might seem to indicate great complexity in the pro- 
gram. Although effective word perception is complex, the 
author has so arranged the pattern of his program that one step 
naturally follows the preceding one. There is of course some
overlapping in teaching the steps. 

Two emphases which seem particularly noteworthy to the
reviewer are: (1) consonant substitution as an aid in attack on new
words, and (2) structural analysis. Somewhat greater emphasis
might have been given to keeping the training intrinsic to con- 
text as much as possible. A poorly trained teacher, attempting 
to apply the methods, might drift into too much word drill as 
such. Although any reading program should be individualized, 
the author might have indicated somewhat more definitely the 
grade levels where in general the various steps in his sequential 
program are to be initiated. 

This text is based upon sound educational and psychological
principles and is very practical. Its use should promote a better
teaching of reading than we have had in the past.
Miles A. Tinker
University of Minnesota


Knight Dunlap. Personal Adjustment. New York: McGraw-Hill
Book Co., Inc., 1946, pp. 446.


“The principles on which this book is based are products of
my work with maladjusted persons during a period of over forty
years.” The book is written out of the author’s experience in
presenting the materials to classes of undergraduates. “I have
. . . revised and rewritten my lectures and now present them so
that general readers, as well as students, may have an opportunity
to read them.”

The author has treated the problem of adjustment strictly 
from a psychological standpoint. In this respect it bears some 
resemblance to Burnham’s The Normal Mind. Reference is made
to psychoanalysis only by way of criticism. In fact the book 
presents quite a scholarly ‘exposé’ of psychoanalysis in Chapter 
XV. The book will probably be read with considerable interest 
by clinical psychologists interested in private practice. 

Two chapters, II and III, treat the subject of learning in about 
the way one would find in an elementary text in general psy- 
chology. They are rather prosaic and do not add much to the 
appeal of the book. Chapters IV and V treat classification, 
description, and etiology of mental disorders. They present 
information the student should have. Other chapter headings 
are as follows: Important Features of Neurotic Maladjustments; 
Goals or Objectives and Their Attainment; Readjustment; Nega- 
tive Practice and Its Applications; Sex and Its Functions in 
Human Life; Marital Adjustment and Maladjustment; Choosing 
a Mate; The Care and Training of Infants and Children; Various 
Minor Maladjustments. The more practical side of the book is
found in these chapters. 

Certainly psychologists and students will be interested in learn- 
ing what Professor Dunlap has to say on this subject. The book 
makes a fairly good case for the treatment of the disordered or 
maladjusted by psychologists.
J. B. Stroud
State University of Iowa










K. W. Vaughn, Editor. National Projects in Educational
Measurement. Series I—Reports of Committees and Conferences.
Washington: American Council on Education, 1947,
11, No. 28, pp. 80.


This is a report of the 1946 invitational conference on testing 
problems held in New York City. Papers presented and dis- 
cussed at the conference include (1) the measurement book 
project of the American Council on Education, (2) units and 
norms in educational measurement, (3) validity of educational 
tests, (4) logical dilemmas in the estimation of reliability, (5) the 
projects of the graduate record office, (6) project in the selection 
of personnel for public accounting, and (7) a nation-wide high- 
school testing program. The papers indicate a dissatisfaction 
with present practices and a looking forward to the providing of 
more adequate materials for students and research workers. 

Miles A. Tinker
University of Minnesota











